Big Data Management

Faction's Cloud Control Volumes (CCVs) offer the best platform for building a data lake for big data. They avoid the issues of building and scaling HDFS (Hadoop Distributed File System) to persist the data, and they avoid having to transform and pipeline data in and out of object storage for public-cloud processing. Because Cloud Control Volumes are natively multi-cloud-enabled, developers can select “best fit” cloud services and platforms and use the same data across multiple clouds.

Simplicity in Persisting Big Data.

Faction's Cloud Control Volumes are simple NFS mounts, so they are broadly compatible with Apache Spark deployed natively, as well as with Hadoop and Spark on Hadoop through NetApp's FAS NFS Connector for Hadoop. This reduces the complexity of persisting big data, because you no longer need to manage a cluster whose local drives form an HDFS cluster. You only need to turn up compute instances when you have processing to do. You can also run a smaller cluster for stream processing and bring up an additional cluster, working against the same data lake, when you need to process historical data.

Removing the need to constantly marshal data between long-term storage and current processing cuts both cost and complexity for many use cases.
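Because the volume is an ordinary NFS mount, a job can address data by a plain file path. The following is a minimal sketch, assuming a hypothetical mount point of /mnt/ccv and hypothetical dataset directories; it only builds the file:// URIs that a stream-processing cluster and a historical-processing cluster could both point at.

```python
from pathlib import PurePosixPath

# Hypothetical mount point (an assumption for illustration); substitute the
# path where the volume is actually mounted on your compute instances.
CCV_MOUNT = PurePosixPath("/mnt/ccv")


def dataset_uri(relative_path: str) -> str:
    """Return a file:// URI for a dataset stored under the NFS mount."""
    return (CCV_MOUNT / relative_path).as_uri()


# Hypothetical dataset layout: both clusters read from the same data lake.
stream_uri = dataset_uri("datalake/events/current")
history_uri = dataset_uri("datalake/events/archive")

print(stream_uri)
print(history_uri)
```

With PySpark installed, a URI like this could be passed to a reader such as spark.read.parquet(history_uri), assuming the data is stored as Parquet and the volume is mounted at the same path on every node that runs the job.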

The Power of Multi-Cloud & Avoiding Lock-in.

As developers increasingly look to tap into the features and benefits of specific clouds, Faction Cloud Control Volumes offer a platform to make the same data available across multiple public clouds at the same time.

Big Data Performance. 

The NFS Connector optimizes big data workloads for NFS by enabling large I/O sizes, parallelizing I/Os, and prefetching data.
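To make two of these ideas concrete, here is a generic sketch of reading a file as large chunks issued in parallel. It is not the NFS Connector's implementation, and it does not model prefetching; the 1 MiB chunk size, the worker count, and the function names are illustrative assumptions.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # 1 MiB per request; illustrative, not a connector default


def read_chunk(path: str, offset: int, length: int) -> bytes:
    """Read one chunk using its own file handle, so workers never share a seek position."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def parallel_read(path: str, chunk_size: int = CHUNK_SIZE, workers: int = 4) -> bytes:
    """Read a whole file as large chunks issued concurrently, reassembled in order."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of the offsets, not completion order.
        chunks = pool.map(lambda off: read_chunk(path, off, chunk_size), offsets)
        return b"".join(chunks)


if __name__ == "__main__":
    # Demonstrate on a temporary local file; on a real system the path
    # would point at a file on the NFS mount.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(3 * CHUNK_SIZE + 123))
    try:
        data = parallel_read(tmp.name)
        print(len(data))
    finally:
        os.remove(tmp.name)
```

Fewer, larger requests reduce per-request overhead, and keeping several requests in flight helps hide network latency, which is the general reason these techniques suit throughput-oriented workloads over NFS.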

Faction's Cloud Control Volumes come in multiple tiers. The least expensive of these, Faction's Deep tier, is typically priced at scale below public cloud object or data lake storage, yet it can still deliver 2 GB/s of throughput at petabyte scale. That makes it an incredibly cost-effective way to build a high-performance data lake.

Faction's Prime tier offers higher performance and adds an SSD-enabled caching tier, which makes it more performant for mixed read/write workloads and for workloads that are less sequential and less throughput-oriented.

Faction's Flash tier offers extreme performance, delivering 2.5 GB/s of throughput even on much smaller (~50 TB) data sets.
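As a rough illustration of what these throughput figures mean in practice, the sketch below estimates the time for one full sequential pass over a data set. It assumes decimal units (1 GB = 10^9 bytes), sustained throughput at the quoted rate, and a 1 PB data set as a stand-in for "petabyte-scale"; real jobs will vary with access pattern and cluster size.

```python
TB = 10**12  # decimal terabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes


def scan_time_hours(dataset_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to read the whole data set once at a sustained throughput."""
    return dataset_bytes / throughput_bytes_per_s / 3600


# Flash tier figures from the text: ~50 TB at 2.5 GB/s (20,000 seconds).
flash_hours = scan_time_hours(50 * TB, 2.5 * GB)

# Deep tier figure from the text: 2 GB/s, with 1 PB assumed as the data set size
# (500,000 seconds).
deep_hours = scan_time_hours(1 * PB, 2 * GB)

print(f"Flash, 50 TB: {flash_hours:.2f} hours")  # 5.56 hours
print(f"Deep, 1 PB: {deep_hours:.1f} hours")     # 138.9 hours
```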

Overall processing also benefits because there is no requirement to marshal data in and out of a persistent storage layer: processing can begin immediately on data mounted as file-based storage. This also simplifies developer efforts.