Matt Prigge
Contributing Editor

Red Hat Gluster, the cloud storage monster

analysis
Oct 24, 2011

With the Gluster acquisition, Red Hat has renewed interest in distributed storage -- and become a big data storage player overnight

Earlier this month, Red Hat announced it had acquired Gluster, developer of the GlusterFS open source file system and the Gluster Storage Platform software stack. In so doing, Red Hat set itself up as a one-stop shop for those looking to deploy big data solutions such as Apache Hadoop. But it also bought a file system that has serious potential for cloud-based deployments. If you haven’t heard of Gluster yet, here’s a quick look at what makes it different from most other scale-out NAS solutions.

A quick tour of Gluster

In Gluster’s own words, GlusterFS is “a scalable open source clustered file system that offers a global namespace, distributed front-end, and scales to hundreds of petabytes without difficulty.” That’s a big claim, but GlusterFS is built to solve big problems — really big problems. In fact, Gluster’s maximum capacity is somewhere in the neighborhood of 72 brontobytes (yeah, that’s a real word).

Perhaps the most important detail to know right off the bat about GlusterFS is that it accomplishes absolutely massive scale-out NAS without one thing that pretty much everyone else in the big data space depends on: a centralized metadata service. Metadata is the data that describes where a given file or block is located in a distributed file system; it’s also the Achilles’ heel of most scale-out NAS solutions.

In some cases, such as Hadoop’s native HDFS, metadata constitutes a dangerous single point of failure. In others, it’s a barrier to truly linear performance scalability, because all nodes must continuously stay in contact with the server(s) that hold the metadata for the entire cluster — which almost always results in additional latency and storage hardware that sits idle waiting for metadata requests to be fulfilled.

Gluster works around this problem through the use of its Elastic Hash Algorithm. Using this algorithm, every node in a Gluster cluster can compute the location of a given file without needing to contact any other node in the cluster — essentially doing away with the need to track and exchange metadata. That gives GlusterFS a huge leg up over its competition and allows it to actually deliver on the promise of linear performance scalability.
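The mechanics are easy to sketch. In the toy script below, the node names are hypothetical and `cksum` stands in for Gluster's real hash function, but the key property is the same: the target node is computed directly from the file name, so no metadata lookup is ever needed.

```shell
# Toy model of metadata-free placement. Any machine that runs this
# computes the same answer, so no node has to ask another where a
# file lives. (Gluster's real algorithm also handles adding and
# removing nodes gracefully, which a bare modulo does not.)
NODES="node1 node2 node3 node4"
FILE="reports/q3-2011.csv"

HASH=$(printf '%s' "$FILE" | cksum | cut -d' ' -f1)   # deterministic hash of the name
COUNT=$(echo $NODES | wc -w)                          # number of nodes in the cluster
IDX=$(( HASH % COUNT + 1 ))                           # map the hash onto a node (1-based)
TARGET=$(echo $NODES | cut -d' ' -f"$IDX")
echo "$FILE is stored on $TARGET"
```

Because the mapping is a pure function of the file name, every client and every node independently agree on where a file belongs, which is exactly what lets Gluster skip the metadata chatter.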

Back-end deployment

GlusterFS is a user-space filesystem driver that can be deployed on just about any brand of Linux (commonly RHEL or CentOS). In other words, GlusterFS is entirely hardware-independent and consequently very portable. In on-premise or private cloud implementations, GlusterFS can be built on top of commodity server hardware with JBOD, DAS, or SAN storage — leaving the choice of what hardware to use entirely up to the end-user. In public cloud environments, GlusterFS can be installed on top of existing product offerings to enable better scalability or survivability (both Amazon and RightScale offer this right now). It is also distributed in an increasingly wide variety of virtual appliances, which allows Gluster nodes to be implemented on top of a hypervisor — either on-premise or in the cloud.

In terms of how data is stored within a cluster of GlusterFS nodes, Gluster can be deployed in several different models with different performance and availability characteristics. The simplest is a distribute-only mode that essentially emulates a file-level RAID0 distribution. In this model, files are stored on only one Gluster node, so the loss of a single node would result in data loss. Not surprisingly, it also offers the highest level of performance and makes most efficient use of storage, since there’s no file duplication.

For applications that require the ability to survive the loss of a node, Gluster can also be deployed in a distributed replica mode that resembles file-level RAID10. In this model, files are distributed over pairs of nodes that are synchronously mirrored. An individual node can be lost and replaced without file availability being impacted.

Finally, Gluster supports a striping mode that operates more akin to a standard block-level RAID0. This mode is generally only recommended in situations where very large files (typically exceeding 50GB) are being stored and where the performance of multiple nodes is required. This is the only mode that will ever divide a file and split it over multiple nodes — all other modes operate at a file level. Unfortunately, mirroring is not supported in combination with striping, so high availability must be built into the hardware if this method is to be used.

Although you can’t mix storage modes within the same Gluster cluster, it is possible to run multiple logical clusters on the same set of hardware. Thus, you could potentially run a distributed replica cluster in parallel with a striped cluster on the same physical hardware.
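As a sketch of what those modes look like in practice, here is roughly how each would be created with the gluster command-line tool. The server and brick names are hypothetical, the exact syntax may vary between GlusterFS releases, and the commands assume a running trusted storage pool, so treat this as illustration rather than a recipe.

```shell
# Distribute-only (file-level RAID0 analogue): one copy of each file,
# spread across all bricks. Losing a node loses the files it held.
gluster volume create dist-vol transport tcp \
    server1:/export/brick1 server2:/export/brick1 \
    server3:/export/brick1 server4:/export/brick1

# Distributed replica (file-level RAID10 analogue): bricks are taken
# in pairs, and each pair is synchronously mirrored.
gluster volume create mirror-vol replica 2 transport tcp \
    server1:/export/brick2 server2:/export/brick2 \
    server3:/export/brick2 server4:/export/brick2

# Striped (block-level RAID0 analogue): individual files are split
# across bricks; recommended only for very large files.
gluster volume create stripe-vol stripe 4 transport tcp \
    server1:/export/brick3 server2:/export/brick3 \
    server3:/export/brick3 server4:/export/brick3
```

Note that all three volumes here live on the same four servers, which is exactly the multiple-logical-clusters-on-shared-hardware arrangement described above.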

In addition to allowing the implementation of distributed replication within a Gluster cluster, it’s also possible to implement N-way geo-replication between clusters. This method can be used to protect against the failure of an entire site or to allow easy migration of applications from one site to another. Gluster geo-replication is very flexible, allowing replication models that include an arbitrary number of intermediate replicas (Site A to Site B, then Site B to Sites C and D, for example).
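Configuring geo-replication is similarly terse. A sketch, with a hypothetical volume name and remote target, using the GlusterFS 3.2-era command syntax:

```shell
# Start asynchronous geo-replication from the local volume "myvol" to
# a directory on a remote host (both names hypothetical)...
gluster volume geo-replication myvol remote-host:/data/myvol-backup start

# ...and check on its progress.
gluster volume geo-replication myvol remote-host:/data/myvol-backup status
```

Unlike intracluster replication, this runs asynchronously, which is what makes it practical over ordinary WAN links.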

It should be noted that it is possible to stretch a Gluster cluster across physical sites, but due to the synchronous nature of intracluster distributed replication, large amounts of WAN bandwidth and very low latency would be required to achieve reasonable performance. In practice, then, a single Gluster cluster would generally be limited to a single site or metro area.

Client access

Gluster allows client access via a wide array of protocols, including the native Gluster client, NFS, CIFS, WebDAV, HTTP, and so on. However, only the native Gluster client can gracefully support highly available, massively parallel file access. Using the native client, all client systems actively attach to all cluster nodes simultaneously, are aware of the cluster’s topology through a client-side implementation of the Elastic Hash Algorithm, and pull data directly from the nodes hosting the data being requested. Thus, client access with the native client never results in data being exchanged between Gluster nodes to satisfy client requests — and the failure of a mirrored node is entirely transparent to applications that depend on Gluster volumes.
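Attaching a client with the native FUSE-based driver is a one-liner. In this sketch the server and volume names are hypothetical; the server named at mount time is only used to bootstrap the mount, after which the client talks to every node directly.

```shell
# Mount a Gluster volume with the native client. "server1" only
# supplies the volume layout; the client then connects to all nodes.
mount -t glusterfs server1:/myvol /mnt/gluster

# Equivalent /etc/fstab entry for mounting at boot:
# server1:/myvol  /mnt/gluster  glusterfs  defaults,_netdev  0 0
```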

Standards-based NFS and CIFS both suffer from serious limitations in that they aren’t natively capable of this kind of highly parallel access. As such, NFS and CIFS deployments require additional software to manage load balancing and high availability, because clients are only capable of connecting to a single storage node at any given time. This is usually managed with round-robin DNS or hardware load balancers in combination with UCARP (a lightweight form of VRRP) or CTDB (a Samba project that enables storage clustering).

Since clients are only attached to a single node at a time, read and write requests must be shuffled between that node and other nodes actually storing the data — a situation that can result in substantially decreased performance compared to using the native client. As a result, deployments using these protocols usually require a separate back-end network that is dedicated to handling the internode traffic necessary to respond to client requests.

Management

Gluster is managed through a combination of a Web GUI that ships with the bare-metal Gluster Storage Platform and a set of command-line tools that are available with the stand-alone GlusterFS distribution. As such, it’s best managed by those already familiar with Linux system administration. For someone with some Linux chops, it’s amazingly simple to use, requiring only a few quick commands to make fairly large changes such as adding a new node or creating a new volume. In fact, the well-known Internet radio company Pandora deployed a 250TB Gluster-based storage back end for its service and has only a single admin dedicated to managing it. If you have some Linux skills and an hour or two, you can implement Gluster. How many other clustered file systems can you say that about?
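To give a flavor of those "few quick commands," here is a sketch of growing a two-way replicated volume by a pair of nodes. Host and volume names are hypothetical, and note that bricks must be added in multiples of the replica count.

```shell
# Add two new servers to the trusted storage pool...
gluster peer probe server5
gluster peer probe server6

# ...extend the replicated volume with a mirrored pair of bricks...
gluster volume add-brick myvol server5:/export/brick1 server6:/export/brick1

# ...and redistribute existing data onto the new bricks.
gluster volume rebalance myvol start
```

That really is the whole workflow for adding capacity, which is the point the Pandora anecdote above is making.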

Applicability in the cloud

Aside from its obvious applicability in building a storage back end to support a cloud environment, Gluster has some neat applications within existing public cloud infrastructures. One of the challenges in building a highly available storage system on a cloud infrastructure like Amazon EC2 is that you really need to bring your own disaster recovery plan. While Amazon in particular offers great reliability for its S3 object-based storage platform, it cannot offer the same service levels for the EBS (Elastic Block Store) volumes that back most EC2 compute instances. Additionally, EBS volumes are limited to 1TB in size, which can make it difficult to work with large datasets.

By implementing Gluster on EC2, you can scale beyond that 1TB per-volume limitation, implement your own mirroring within an EC2 availability zone, and even leverage Gluster geo-replication to get your data replicated to instances in a different EC2 availability zone, to a different cloud provider, or even back to your on-premise infrastructure. Granted, not everyone needs that kind of reliability and scalability, but for those who do, it’s potentially a great answer.

Putting it all together

More than a few analysts watching Red Hat’s acquisition of Gluster have noted the obvious big data applications, especially as they relate to upcoming HDFS and Amazon S3 API compatibility being added in GlusterFS 3.3. Yet Gluster has the potential to be more than just a good big data storage file system. With compatibility for a range of different hypervisors as well as upcoming support for OpenStack, Gluster may play a big part in the back-end infrastructure of clouds — public and private alike.

This article, “Red Hat Gluster, the cloud storage monster,” originally appeared at InfoWorld.com. Read more of Matt Prigge’s Information Overload blog at InfoWorld.com.