by Greg Nawrocki

A Quick Primer on Linux Clustering from Donald Becker

News | Apr 7, 2006 | 5 mins

I recently had the opportunity to catch up with clustering pioneer Donald Becker, who started the Beowulf project and is now the CTO of Penguin Computing. I asked him a few questions about the history of Linux clustering, and here’s what he had to say …

Q: Tell us a little bit about the genesis of the Beowulf Project, and how Linux clustering has evolved since those early days.

Becker: Beowulf started out as a way for people to use collections of commodity, off-the-shelf Linux machines for high-performance computing … as an alternative to using purpose-built, specialized machines.

The real key to doing that is providing a software layer that hides, as much as possible, the “ugliness” of machines not designed for high-performance computing. So it’s about a software system and a methodology to put together machines that can be used effectively in high-performance computing.

From the beginning of the Beowulf project in 1994, we targeted Linux as the platform for the software we were deploying. At the time Linux had a very small presence in high-performance computing and in the market in general. I like to think Beowulf had a strong influence on Linux becoming popular for high-performance-computing clusters. This eventually led to Linux being a popular platform for doing all sorts of things in the HPC realm.

Q: So what were some of the specific challenges for clustering commodity Linux boxes for HPC?

Becker: In the early days, the challenge was as simple as getting the machines to talk to each other – my background on the Linux side was contributing networking code to the kernel. Getting the machines to communicate meant figuring out a lot of the high-throughput, low-latency communication requirements for clusters. From there, the focus turned to communication libraries and managing large sets of machines. One of the things about scientists is they’ll put up with quite a bit in terms of complex systems. The rest of the world, however, wanted to push complexity into the background — to minimize it — because they’re much more focused on their own specific applications.

Q: What are some of the unique management issues in Linux cluster environments?

Becker: Our focus on the cluster management side is consistency. We’re trying to make the cluster look like a single system from the point of view of the end user and the administrator. We want to guarantee that a process run remotely will return the same results as one run locally, even in the face of library updates, application updates, and user setup changes.

To accomplish this you must administratively control what’s installed on the machine. Inside of a cluster, we focus on dynamically provisioning machines, making certain that we control every detail of how the machines are installed. So we go the whole way down to loading kernels and managing device drivers and up to the level of making sure the right versions of libraries are there.

One of the things we do within a cluster is we try to guarantee consistency — that an application you run on a remote machine will run exactly the same as it does on a local machine. It’s a lot easier to guarantee consistency within a cluster than it is over a Grid. When you have local-area high-bandwidth communication and have administrative control over the machines, you have a lot more opportunities for consistency. The challenges are the same for a Grid, but we were able to pick an easier set of problems to solve within a local cluster.
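The consistency idea Becker describes — every node reporting an identical software stack before a job is trusted to run there — can be sketched in a few lines. This is purely illustrative (the node names, package lists, and fingerprinting approach are assumptions for the example, not Penguin Computing’s actual tooling): each node’s installed-package manifest is hashed, and the cluster is "consistent" only if all fingerprints match.

```python
import hashlib

def manifest_fingerprint(packages):
    # Canonicalize a node's (name, version) list so that ordering
    # doesn't matter, then hash it into a comparable fingerprint.
    canonical = "\n".join(f"{name}=={ver}" for name, ver in sorted(packages))
    return hashlib.sha256(canonical.encode()).hexdigest()

def consistent(nodes):
    # True when every node reports the same software manifest.
    fingerprints = {manifest_fingerprint(pkgs) for pkgs in nodes.values()}
    return len(fingerprints) == 1

# Simulated manifests for three nodes (hypothetical data).
nodes = {
    "node1": [("libc", "2.3.2"), ("lam", "7.0")],
    "node2": [("lam", "7.0"), ("libc", "2.3.2")],  # same set, different order
    "node3": [("libc", "2.3.2"), ("lam", "7.1")],  # drifted library version
}

print(consistent({"node1": nodes["node1"], "node2": nodes["node2"]}))  # True
print(consistent(nodes))  # False: node3 has drifted
```

A real provisioning system would go further, as Becker notes — down to kernels and device drivers — but the same principle applies: detect drift, then reprovision the offending node rather than patch it by hand.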

Q: What sorts of applications are better suited for a cluster than a Grid?

Becker: One of the reasons to select a cluster instead of a Grid is to run applications that require low-latency communication, and you’re pretty much constrained to do that on local machines. You can get very low-latency interconnects for clusters. That’s difficult to accomplish with a Grid. So there are some application characteristics that prevent them from running effectively over a Grid unless you’re doing it at the coarsest grain level.

One class of applications, “spectral methods,” requires all-to-all communication at each time step. And the length of a time step is often determined by the latency of that communication, so many of these applications really only run effectively on machines that are local to each other.
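Why latency pins these codes to local machines can be shown with a toy cost model (the numbers and the naive send-to-everyone exchange are illustrative assumptions, not a real spectral solver or real collective algorithm): each step pays its compute time plus one per-message latency for every peer, so wide-area latencies swamp the step entirely.

```python
def step_time(p, compute_s, latency_s):
    # Toy cost model for one time step: local compute plus a naive
    # all-to-all exchange in which each node messages the other p-1 nodes.
    # Real MPI collectives are smarter, but the latency term still dominates
    # once per-message latency is large relative to compute.
    return compute_s + latency_s * (p - 1)

# 64 nodes, 10 ms of compute per step (hypothetical figures).
cluster = step_time(64, 0.010, 5e-6)   # ~5 microsecond local interconnect
grid    = step_time(64, 0.010, 50e-3)  # ~50 ms wide-area round trips

print(f"cluster step: {cluster * 1000:.2f} ms")  # communication is a small overhead
print(f"grid step:    {grid:.2f} s")             # latency dominates completely
```

Under these assumed numbers the grid's per-step cost is hundreds of times the cluster's, which is the effect Becker describes: the time step length becomes a function of interconnect latency, not computation.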

Now, on the other side of that, a surprising amount of computation today is parametric execution. I think a decade ago, people didn’t really expect this. Today you have machines powerful enough to run pretty big simulations. What you need is to run hundreds of thousands of similar simulations, but with different input parameters. For workloads like that, wide schedulers – wide-area scheduling systems in Grids – are very effective. Of course these are also very easy tasks for clusters to do.
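Parametric execution of the kind Becker describes — the same simulation run many times over different inputs — is easy to sketch, since the runs share no state. Here a local thread pool stands in for a wide-area or cluster scheduler, and `simulate` is a trivial placeholder for a real model (both are assumptions made for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def simulate(params):
    # Stand-in for one simulation run; a real job would be a full model
    # executed on whichever node the scheduler assigns.
    x, y = params
    return x * x + y  # toy "result"

# A small parameter sweep -- in practice this would be hundreds of
# thousands of input combinations, each an independent job.
sweep = [(x, y) for x in range(3) for y in range(2)]

# The jobs are independent, so any scheduler -- thread pool, batch
# queue, or Grid-wide scheduler -- can farm them out freely.
with ThreadPoolExecutor() as pool:
    results = dict(zip(sweep, pool.map(simulate, sweep)))

print(results[(2, 1)])  # 5
```

Because no run ever waits on another, this is the workload where wide-area scheduling shines — and, as Becker says, it is also trivial for a cluster.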