Prepare your applications for the future of distributed computing

Multicore processing power and cloud computing are two of the most exciting challenges facing software developers today. Multiple chips or processing cores enable individual computing platforms to process threads remarkably fast, and the advent of cloud computing means that your applications could run on multiple distributed systems. In this first half of a two-part article, Appistry engineer Guerry Semones gets you started with the four design principles for writing cloud-ready, multicore-friendly code: atomicity, statelessness, idempotence, and parallelism.

Level: Intermediate

With the advent of cloud computing, the explosive growth of mobile device use, and the growing availability of multicore CPUs, there is a drive toward new models for designing and developing code. Well, they’re not new, really. Some of the models have been around, and in use, for some time, but they’re not typically practiced by mainstream developers ... yet.

Michael Groner wrote about this trend in a blog post entitled “Microsoft says you need to change how you are building your applications.” Here is part of what he said:

I was surprised how many speakers [at Microsoft TechEd 2008] were conveying the same message: CPU speeds are topping out. If you want your applications to run faster and better you are going to have to build your applications in a new way. The solution isn’t just to learn how to multi-thread your applications. The solution lies in building your applications into smaller units of code called tasks that can be moved around to the different cores of a multicore machine.

The message that the time had come to “take programs and break them down into parallel execution units” was music to Michael’s ears. Why? It is the same model that we [at Appistry] have been using for fabric computing since our beginnings in 2001.
Now we are starting to see this message more and more as cloud computing gains acceptance.

For your applications to meet the demands of computing in a cloud-based, multicore world, you’ll need to design your code with the following attributes in mind:

- Atomicity
- Statelessness
- Idempotence
- Parallelism

In this article, the first of two parts, I’ll discuss each attribute in turn and then explain what it means in practice, when you write code. In the next article you’ll get a glimpse of how applications written to maximize these four attributes can reap the benefits of various multi-processor, distributed architectures.

Atomicity

An atomic piece of code has a specific and clearly defined purpose. In object-oriented terminology, it has cohesion. Atomic code adheres to Robert Martin’s single responsibility principle, where not only classes but also methods have distinct jobs, each method having a single concern. Such code is like a delicious pepperoni pizza, not the garbage pizza loaded with everything. Additionally, atomic code does not rely on a specific order of execution.

Don’t lean on me

While atomic code may require other libraries, its execution is self-contained; that is, it contains no call-level interdependencies. If there is interdependency across individual method calls, then your individual methods are not atomic, though together they may form an atomic operation. Consider the simple example in Listing 1. A call on the setRadius() method must precede a call on any of the computational methods, such as getDiameter(). Therefore, setRadius() and getDiameter() are not atomic separately. Though this is fine from a purely object-oriented design standpoint, it has implications in distributed and parallel environments. The computational methods in Listing 1 cannot be processed in parallel across distributed workers or cores independently of setRadius().

Listing 1.
A simple class with call-level interdependencies

package com.appistry.samples;

public class NonAtomicCircle {

    private int radius;

    public void setRadius(int radius) {
        this.radius = radius;
    }

    // setRadius must be called before any of the following

    public float getArea() {
        return 3.14f * (radius * radius);
    }

    public int getDiameter() {
        return 2 * radius;
    }

    public float getCircumference() {
        return 3.14f * getDiameter();
    }
}

The code in Listing 2, though more static in nature, does not have this dependency or its associated restrictions, and can be executed independently. In more complex classes involving equally complex state, this can become a critical design consideration.

Listing 2. A simple class with atomic methods

package com.appistry.samples;

public class AtomicCircle {

    private int radius;

    public void setRadius(int radius) {
        this.radius = radius;
    }

    // These computational methods can be called independently
    // and distributed or run in parallel.

    public static float getArea(int radius) {
        return 3.14f * (radius * radius);
    }

    public static int getDiameter(int radius) {
        return 2 * radius;
    }

    public static float getCircumference(int radius) {
        return 3.14f * getDiameter(radius);
    }
}

It’s simple, my dear Watson

Atomic code is also concise by its nature and, as stated above, has a specific, clearly defined purpose. Fat, hairy, do-multiple-things methods like provisionLineItem() in Listing 3 are likely not atomic (though the methods it delegates to are candidates). If a method serves multiple purposes, then you should break those purposes up into separate atomic methods, and likely into separate classes; reserveInventory() and calculateWeight(), for example, sound like separate concerns.

Listing 3.
A non-atomic method with too many responsibilities

package com.appistry.samples;

public class OverAchiever {

    public void provisionLineItem(Shipment shipment, String sku, int quantity) {
        Product item = reserveInventory(sku, quantity);
        float weight = calculateWeight(item, quantity);
        shipment.add(item, quantity, weight);
        reportPopularityToCorporate(sku, quantity);
    }
}

The same holds true for long-running methods: they are not usually atomic. Long-running code may have a single purpose, but if you can break the method down into more atomic steps, then you gain flexibility when running that code in cloud environments or on multiple cores.

Flexibility and control

The choice to break your code down into more atomic steps is all about flexibility and control. Whether your code is executing across multiple cores on multiple threads or processes, or across multiple servers in a distributed computing environment, you gain more control over how it executes, and make that execution more flexible, if you break the code down into atomic steps. For example, a general rule of thumb is that your code ought to be able to execute anywhere in a distributed computing environment at any given time. If setRadius() in Listing 1 were called on an instance of NonAtomicCircle on one server, and then getDiameter() were called on a different instance of NonAtomicCircle on another server, then obviously the interdependency could not be satisfied without additional overhead and effort. (You’ll see an example of such overhead later in this article’s discussion of statefulness versus statelessness.) Such overhead, if not handled correctly, can inhibit scalability and availability.

In addition, some distributed computing environments plan for failure; indeed, they expect failure to arise at any time while code is executing. Consider the provisionLineItem() method of the OverAchiever class in Listing 3.
If a power supply failure takes down a server while it is executing the provisionLineItem() method (which does multiple things), and your distributed environment allows for automatic retries of failed method calls, what are the implications? How far did the provisionLineItem() method get in its work? The reserveInventory() step? Or the calculateWeight() call? Which steps were satisfied, and what was left undone?

Atomic steps allow more fine-grained levels of reliability and problem resolution. Typically, distributed environments use compensating tasks to check failed calls and make corrections if needed. If reserveInventory() is called as an atomic method call and is interrupted, you know exactly where you were, and what needs to be checked and corrected. When it is called within an aggregate, coarse-grained method like provisionLineItem(), however, you make failure resolution more difficult. Figure 1 shows a sample orchestration of atomic calls being handled reliably during failure in a cloud application platform environment.

Figure 1. Atomic calls being handled reliably during failure in a cloud application platform environment

Long-running code only makes this situation worse. Perhaps the long-running code has a single purpose; but if it runs for a long time to achieve a single end, and the server running it dies ninety percent of the way through the job, do you retry it from the beginning? Starting over is fine for some purposes, but in time-critical work it would be unacceptable. If a long-running method can be broken down into more atomic steps, then your distributed application can snapshot progress along the way, and retry from failure mid-process more efficiently.

The essential value of atomicity in parallelism and multicore computing

In terms of parallelism and multicore computing, Daniel Spiewak wrote an excellent entry on this very topic on his blog, Code Commit.
Daniel argues that

[t]here is actually a deeper question underlying concurrency: what operations do not depend upon each other in a sequential fashion? As soon as we identify these critical operations, we’re one step closer to being able to effectively optimize a particular algorithm with respect to asynchronous processing.

Later in his post, he states:

This is truly the defining factor of atomic computations: it may be possible to reorder a series of atomic computations, but such a reordering cannot affect the internals of these computations. Within the “atom,” the order is fixed.

So what does reordering have to do with concurrency? Everything, as it turns out. In order to implement an asynchronous algorithm, it is necessary to identify the parts of the algorithm which can be executed in parallel. In order for one computation to be executed concurrently with another, neither must rely upon the other being at any particular stage in its evaluation.

All of that should sound familiar at this point, and it serves as a good stopping point on this topic.

Statelessness

A friend of mine used to say, “All generalizations are false, including this one.” Generally, the rule of thumb in distributed computing is that stateful objects are bad and stateless objects are good. There are a number of advantages to being stateless in the cloud and in concurrent programming, and you’ll see them in detail in this section.

In my discussion of atomicity above, I said that distributing or parallelizing two non-atomic calls, where method A changes state that method B needs, introduces overhead; I noted that if this isn’t handled correctly, it can inhibit scalability and availability. Now I’ll show you in more detail what scalability and availability mean in the context of statelessness versus statefulness, and add load balancing, reliability, and concurrency safety to the discussion.

Stateless vs. stateful

First, some definitions are in order.
A stateless object does not hold information or context between calls on that object. In other words, each call on that object stands alone and does not rely on prior calls as part of an ongoing conversation. A caller could call a method on one instance of a stateless object, and then make a call on a different instance of the same object, and not be able to tell the difference, as illustrated in Figure 2.

Figure 2. A client calls multiple instances of a stateless object without affecting the outcome of the calls

This does not mean that a stateless object does not deal with data or context; rather, such an object simply does not hold that data or context across multiple method calls, and does not access it concurrently with other processes, be they distributed processes or local threads. Consequently, multiple callers may access any instance of a stateless object and have their calls satisfied, as illustrated in Figure 3.

Figure 3. Multiple clients concurrently call distributed instances of a stateless object

Statefulness implies certain conditions. Take, for example, a stateful shopping cart object. Each instance of a stateful shopping cart object might represent an individual customer. Therefore, if you had a thousand customers shopping on a site, there would be, in theory, a thousand instances of the stateful shopping cart object. To update the state of a given customer’s shopping cart, you must find the correct object for that customer and make calls on it. This is the Neo-the-One version of statefulness: only one object can do the job, and Morpheus must find it ... somewhere in the Matrix.

Statefulness may also imply shared state. Shared state may be as simple as an object that is generally stateless in terms of member data (per-instance values) and yet holds onto static data (per-class values) that changes from method call to method call, and affects the outcome of subsequent method calls.
The static data injects its importance into the conversation between the callers and the object. Shared state gives rise to increased complexity, whether it is shared across a cloud of computing resources or across local threads or processes.

Therefore, a general definition of statefulness is in order: a stateful object holds onto data, information, or context that is important across the conversation of multiple calls on a given instance of that object (as illustrated in Figure 4), and multiple callers may need concurrent access to that state (as illustrated in Figure 5).

Figure 4. A calling client returning to the same object instance as part of an ongoing stateful conversation

Figure 5. Multiple clients concurrently calling a single, stateful instance of an object

If an object is stateless, the cloud or the multicore environment is free to use any instance of that object to get its work done (as in Figures 2 and 3). Therefore, any instance of the called object can be Neo. From a design standpoint, this affects scalability, availability, load balancing, and reliability.

Statelessness and scalability

The use of stateless objects can help scale an application in a distributed computing environment like the cloud, because work can be load-balanced onto whatever machine is best suited to do it. You don’t have to worry about contention for a single instance of an object (as in Figure 4), a situation that could potentially lead to race conditions, deadlock, and starvation; instead, you can just add more machines and let the work scale out horizontally onto them. If you are dependent on a single, golden instance of a stateful object, and that instance may be wanted by many concurrent callers, then, depending on the stateful model you’re using, you may be creating a scalability bottleneck.
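The contrast is easy to see in code. In the following sketch (the class and method names are illustrative, not taken from any real cart framework), the stateful cart must be the one instance holding a given customer’s items, while the stateless pricer holds nothing between calls, so any instance on any machine can price any cart:

```java
import java.util.HashMap;
import java.util.Map;

public class CartExample {

    // The "Neo the One" model: exactly one of these per customer,
    // and every update must find that one instance.
    static class StatefulCart {
        private final Map<String, Integer> items = new HashMap<>();

        public void add(String sku, int quantity) {
            items.merge(sku, quantity, Integer::sum);
        }

        public Map<String, Integer> getItems() {
            return items;
        }
    }

    // Stateless: all context arrives as parameters and nothing is held
    // between calls, so any instance is interchangeable with any other.
    static class StatelessPricer {
        public float total(Map<String, Integer> items, Map<String, Float> prices) {
            float total = 0f;
            for (Map.Entry<String, Integer> entry : items.entrySet()) {
                total += prices.getOrDefault(entry.getKey(), 0f) * entry.getValue();
            }
            return total;
        }
    }
}
```

Any number of StatelessPricer instances, on any number of machines, will return the same total for the same cart contents; the StatefulCart, by contrast, is only meaningful as long as callers can find that one instance.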
However, no matter what stateful model you use, you are still introducing more overhead than you would see in a stateless model.

Statelessness and availability

If your cloud environment is free to use any machine to get its work done, then you have very widespread availability (as in Figure 3). In case of failure, the loss of any machine does not affect you, provided you have enough machines to continue; this is illustrated in Figure 6.

Figure 6. Stateless objects promote availability and reliability

Again, if you are dependent on a single, golden instance of a stateful object, and the resources holding that instance go down, you have lost availability for that instance of that object, and you must reconstitute it elsewhere, if possible, hopefully without loss of state; this is illustrated in Figure 7. At a minimum, you have introduced complexity into your application.

Figure 7. Stateful objects reduce availability, and may reduce reliability

Statelessness and load balancing

If the cloud environment is free to use any machine to get work done, then whatever machine has the least overhead can be given the work. (This is illustrated in Figure 3.) If there is a single, golden instance of an object on a single machine, and it is getting all the attention from calling clients, then you may have both a load balancing problem and a resource contention problem (as shown in Figure 5). The model of statefulness you use will determine the severity and complexity of the problem, and determine whether or not you can spread the work across multiple machines sharing a cache of the same stateful instance of the object.

Statelessness and reliability

The stateless model also bolsters a cousin of availability: reliability. Statelessness, coupled with a self-healing cloud environment, affords a lot of flexibility.
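For example, when a call is atomic, stateless, and safe to repeat, failover can be as simple as re-submitting the call. A minimal sketch of that retry behavior (the names and the fixed-attempts policy are illustrative, not an actual platform API):

```java
import java.util.concurrent.Callable;

public class RetryingExecutor {

    // Because a stateless, atomic step carries everything it needs in its
    // arguments, a failed attempt can simply be run again -- in a real
    // distributed environment, typically on a different machine.
    public static <T> T callWithRetry(Callable<T> atomicStep, int maxAttempts) throws Exception {
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return atomicStep.call();
            } catch (Exception e) {
                lastFailure = e; // a real platform would log this, and possibly
                                 // run a compensating task before retrying
            }
        }
        throw lastFailure;
    }
}
```

Notice that nothing about the step needs to be reconstituted between attempts; that is the flexibility statelessness buys.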
If a machine goes down in such an environment due to hardware failure, the work can continue with another instance of the object on another machine (as shown in Figure 6). You don’t have to worry about loss of consistency or other complexity, as you would if you had depended on a stateful object that has been lost.

Statelessness and concurrency safety

Stateful code in any form, in a distributed or parallel environment, requires you to be mindful of concurrency issues. Stateless code, on the other hand, is more likely (no guarantees, of course) to be concurrency safe as you execute it across distributed nodes, or across multiple threads and cores. Stateless code is also in most cases easier to execute across multiple processes instead of multiple threads, a model seen in Google’s Chrome and Microsoft’s Internet Explorer 8 browsers, where each tab is a process, not a thread. Concurrency safety is a complex topic; rather than rambling on myself, I’ll point you to a great article by Brian Goetz. According to Goetz, “many Web applications that use HttpSession for mutable data (such as JavaBeans classes) do so with insufficient coordination, exposing themselves to a host of potential concurrency hazards.” There is a lot of that kind of code already out there, and it only gets more complex in distributed and multicore parallel applications.

Idempotence

If code is atomic and stateless, it may take on other attributes, like idempotence.
If a method is idempotent, you can execute it repeatedly and get the same outcome without adversely affecting anything else, like state.

Wikipedia defines idempotence in computing thusly:

In computer science, the term idempotent is used to describe methods or subroutine calls that can safely be called multiple times, as invoking the procedure a single time or multiple times results in the system maintaining the same state; i.e., after the method call all variables have the same value as they did before.

Example: Looking up some customer’s name and address in a database is typically idempotent, since this will not cause the database to change. Placing an order for a car for the customer is not idempotent, since running the method/call several times will lead to several orders being placed, and therefore the state of the database being changed to reflect this.

In Event Stream Processing, idempotence refers to the ability of a system to produce the same outcome, even if an event or message is received more than once.

Idempotence in a method or function is not always possible. While some logic lends itself naturally to idempotence, other logic may require safeguards in its design to assure safe repeatability.

If your code can be made idempotent, you’ll have a lot of flexibility in executing that code across a distributed environment like the cloud. By their nature, some distributed environments expect failure and are designed to work around it. Consider how the Internet routes around failure. If I request a Web page through my browser, and the request fails due to a backhoe cutting a trunk line in Des Moines, I can hit Reload and my second request will likely take a different route to get the page. I could probably request that page repeatedly and not affect anything (though I may be driving up someone’s ad-based revenue).
That’s idempotence in action, in a distributed environment where failure happens and the environment deals with it.

Building idempotent code

In Listing 4, you have an example of a class that accumulates the total tax on an order. Each time addToTaxTotal() is called, the member variable totalTax is destructively incremented. Though the class handily wraps up the tax value, there is no way to get back to a prior value if an error condition occurs. Calling addToTaxTotal() multiple times with the same inputs will yield different results, because the method does not leave memory in the same state. Therefore, by definition, this method is not idempotent. (Ignore for the moment the fact that there is a dependence on state here, since those arguments were covered earlier.)

Listing 4. A non-idempotent tax calculation class

package com.appistry.samples;

public class NonIdempotentTaxCalculator {

    private float totalTax;

    public void addToTaxTotal(Item item, String zip) {
        totalTax += lookupTax(item.getValue(), zip);
    }
}

In Listing 5, you have an idempotent class that, given the same inputs, always returns the same results. The IdempotentTaxCalculator takes an array of order items and iterates over them using only local variables on the stack. No matter how many times calculateTax() is called, it will always return the same results given the same inputs.

Listing 5. An idempotent tax calculation class

package com.appistry.samples;

public class IdempotentTaxCalculator {

    public float calculateTax(Item[] items, String zip) {
        float tempTax = 0.0f;
        for (Item item : items) {
            tempTax += lookupTax(item.getValue(), zip);
        }
        return tempTax;
    }
}

But what if you need to change state? How can you make a method that changes state behave idempotently? First off, just saying the word state implies that you’re keeping information somewhere. It might be in a shared database, a distributed cache, or a single, golden object somewhere in your system.
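Before turning to that question, Listing 5’s repeat-safety is easy to verify directly. Here is a self-contained variant you can call any number of times; Item and the flat-rate lookupTax() below are illustrative stand-ins, since those helpers are not defined in the listings:

```java
public class IdempotenceDemo {

    // Stand-in for the order item type used in Listings 4 and 5.
    static class Item {
        private final float value;

        Item(float value) {
            this.value = value;
        }

        float getValue() {
            return value;
        }
    }

    // Stand-in tax lookup: a flat 7% rate regardless of ZIP code.
    static float lookupTax(float value, String zip) {
        return value * 0.07f;
    }

    // Only local variables are touched, so repeated calls with the
    // same inputs always produce the same result.
    public static float calculateTax(Item[] items, String zip) {
        float tempTax = 0.0f;
        for (Item item : items) {
            tempTax += lookupTax(item.getValue(), zip);
        }
        return tempTax;
    }
}
```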
If your application keeps state in such a place, then you can leverage that state to your advantage. Listing 6 is somewhat oversimplified, but it conveys the general idea: design so that multiple calls can be absorbed without side effects. This concept is used in distributed messaging systems, and in any system where multiple calls for the same operation may arrive but only one outcome is desired. The class in Listing 6 guards against repeatedly processing the same incoming order message by using the available state of the order messages in the queue.

Listing 6. Using state to enable a stateful object’s methods to be idempotent

package com.appistry.samples;

public class OrderProcessorService {

    public synchronized void processOrder(Order order) {
        if (orderQueued(order)) {
            return;
        }
        enqueueOrder(order);
    }
}

Benefits

Many of the arguments for the value of idempotence are the same as those you’ve already seen for atomic, cohesive, and stateless code running in distributed environments and on multi-threaded, multicore designs. Idempotent code’s repeatable, non-destructive nature improves availability and makes load balancing easier. You can run the code on any thread or any node and not worry about side effects.

Reliability in a distributed computing environment like the cloud sometimes means that an execution step may be interrupted by hardware failure. In that event, the distributed environment may decide to repeat the execution step elsewhere. The repeatability of idempotent code is ideal in these circumstances, because the method call can be repeated without worry and without dependencies.

Parallelism

“Parallel code?” you may ask. “I don’t need no stinkin’ parallel code!” In the past, writing parallel code was a specialized discipline, and most of us did not have parallel processing hardware in our basements. Today, however, most new workstations, servers, and laptops have multicore CPUs.
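You can ask the JVM how much of that hardware it can actually see:

```java
public class CoreCount {

    // Reports how many processors (cores, or hardware threads) the JVM
    // can schedule work onto.
    public static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Available cores: " + availableCores());
    }
}
```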
If you do have a dual-core, quad-core, or better setup, and your software doesn’t make use of it, then you are literally leaving computing power lying on your desk, your rack, or your lap.

The industry is quickly moving to tools and languages that make writing and running code in parallel across multiple cores easier, if not brain-dead simple. Languages are getting new constructs to aid multicore development, some as simple as looping directives that execute loops in parallel instead of serially. Functional languages in particular, like Erlang, Scala, and F#, are looked upon favorably because they lend themselves to parallel logic. For example, such languages have immutable objects by default, which enforces data safety across threads. Functional languages also encourage the other base principles that you’ve already seen discussed: atomicity, statelessness, and idempotence. Similar constructs are appearing in languages and frameworks that make parallel execution across distributed environments simpler, too.

That said, most if not all distributed cloud-based environments allow you to run code in parallel today, without waiting for special language constructs in Java (or whatever your own language du jour is). Indeed, due to their underlying execution models, some of these distributed environments also automatically take advantage of multiple cores on each compute node in an efficient way, and you don’t have to adapt your code to get these benefits. This allows you to take your plain old Java objects and enable parallel execution across distributed computers while utilizing multiple cores.

Perhaps the largest challenge some developers face is choosing what code can be run in parallel. Again, tools are coming that will help in this respect. Some just identify the candidate code sections for you; others will try to do it all automatically.
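Java has since gained exactly the kind of looping construct described above. As one sketch, a parallel stream (Java 8 and later) spreads the iterations of a loop across the available cores; note that each iteration here is atomic, stateless, and idempotent, which is what makes the parallel form safe:

```java
import java.util.stream.IntStream;

public class ParallelLoopDemo {

    // The same loop, with its iterations spread across the available
    // cores. No iteration depends on any other, so no coordination
    // is needed.
    public static long sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n)
                        .parallel()
                        .mapToLong(i -> (long) i * i)
                        .sum();
    }
}
```

Remove the .parallel() call and the result is identical; only the scheduling changes, which is the whole point of keeping the loop body free of shared state.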
However, whether or not you’re aided by such tools, the code design principles you’ve seen in this article become more powerful when used together, and go a long way toward making your code good, parallel-ready code.

The future runs on multiple processors

As you’ve walked through the design principles that will make your code cloud-ready and multicore-friendly, you’ve also gotten glimpses of the benefits that cloud computing platforms can bring to that code. In the second half of this article, I’ll explain what exactly we mean when we talk about cloud computing. Find out how a distributed cloud platform can bring scalability, reliability, and availability to your code with little or no intrusion.

Guerry A. Semones is a founding senior engineer and product manager at Appistry, a pioneer and leading provider of next-generation cloud application platforms. Guerry also serves as liaison to the Appistry Peer2Peer developer community. Find out more by reading his blog.