Matt Prigge
Contributing Editor

VMware vSphere storage performance tuning

Analysis
Apr 25, 2011 · 10 mins

Improve storage performance and your understanding of vSphere's nuts and bolts by getting under the hood

If you’re like most people, you may think that spending time tuning your storage infrastructure is a luxury you can’t afford. In truth, that time is an investment: it teaches you a great deal about what makes your storage tick. That knowledge can be invaluable when you plan additions to your storage infrastructure, or when you get into a fight with a vendor over whether an application or your storage is to blame for poor performance.

Storage tuning is also one of the deeper topics you’ll find in IT today. There are a vast number of variables that can affect performance, many of which will be specific to the applications you run and the type of equipment you have fielded to run them. Tuning storage is by no means a one-size-fits-all process.

Yet some basic concepts can help anyone. Here, I’ll dig into some of the concepts for tuning the storage performance of VMware vSphere virtualization hosts.

Where to start

The first thing to do is to gain some general understanding of what your storage is being asked to do and how it can affect application performance. Though some applications rely on raw bulk data throughput, most live and die by the transactional performance they can wring out of your disk. Unfortunately, transactional performance — generally characterized by the amount of small, random reads and writes that can be completed in a given period of time — is also the hardest for your storage to keep up with.

Analyzing application performance requirements usually starts with a combined approach of monitoring what your applications do in production, then testing the limits of your existing storage configuration by running a thorough stress test. Once you’ve kicked the tires in this way, you can loop back as you make changes to gauge what effect they had. This is a very important point — many attempts at tuning for performance end up decreasing it, so it’s important to know your changes are having the desired effect.

Get the tools you’ll need

Though VMware has done a good job of exposing a lot of the inner workings of vSphere through the vSphere Client, much of the critical tuning work still needs to be done via the command line. If you’re running the fully installed version of vSphere/ESX, you already have a Linux service console you can SSH into and access the command line.

If you’re running the embedded ESXi version, this can be a bit trickier, as the service console doesn’t strictly exist; there is a command line, but it’s really more suited to troubleshooting than day-to-day maintenance. It’s also worth noting that vSphere 4.1 will probably be the last release to offer a full service console; we’ll likely all be using ESXi from here on out.

Whichever version you run, I’d strongly recommend downloading and installing the VMA (vSphere Management Assistant). The VMA is a prebuilt, Linux-based virtual appliance already kitted out with all of the command-line tools you’ll need to manage any flavor of vSphere/ESX installation, embedded or not. Those include configuration tools that let you script just about any management task you can imagine, as well as real-time monitoring tools (such as resxtop) that give you extra visibility into your environment. There’s also a wealth of community-generated scripts you may find helpful.
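As a quick sketch of what that monitoring looks like in practice (the hostname and username below are placeholders; substitute your own host and credentials), resxtop can be run from the VMA either interactively or in batch mode:

```shell
# Interactive mode: once connected, press 'd' for disk adapter stats,
# 'u' for per-device stats, and 'v' for per-VM storage stats.
resxtop --server esx01.example.com --username root

# Batch mode: capture 60 samples to CSV for offline analysis
# (a spreadsheet or Windows perfmon can read the output).
resxtop --server esx01.example.com --username root -b -n 60 > storage-stats.csv
```

Batch captures taken before and after a tuning change give you the before/after comparison this article recommends.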

For our purposes, these tools will let you change vSphere’s default storage behavior in ways that you can’t (yet) do through the GUI. No matter what storage back end you run, there will almost always be some tuning procedure that requires the command line, so it’s a good place to start.

Do the reading

Before you jump in and start making changes, it’s important to crack the books and do some reading. Yes, I know, nobody reads manuals anymore, but this is one time you need to make an exception. One good place to start is the Performance Best Practices doc that VMware publishes; with the guide for vSphere 4.1, the storage material starts on page 25.

Pretty much every type of storage will have very specific ways in which you can improve performance through vSphere, and almost all of them are different. Worse still, these differences can even exist between different firmware revisions on the exact same hardware.

Though the quality of the documentation available may vary from one storage vendor to another, what you’re looking for is a best practices document that lays out the vendor’s preferred means of configuring vSphere for running with their hardware (check out the detailed example that covers the HP Enterprise Virtual Array series). If you can’t find a similar doc for your hardware, the easiest thing to do is open up a ticket with support and ask for recommendations.

However you get the info from your storage vendor, be absolutely sure it applies to the version of vSphere/ESX you’re running. The storage layer has changed immensely through the last few major and minor versions. For example, the optimal configuration for connecting to most iSCSI SANs has almost entirely changed in between each version of ESX 3.0, ESX 3.5, vSphere 4.0, and vSphere 4.1. The Fibre Channel changes have been a little less drastic, but still very important, especially between ESX 3.x and vSphere 4.x. No matter what you’re running, make sure the documentation and recommendations match.

Knobs you can turn

As I alluded to earlier, a mass of performance-related storage variables can be changed in vSphere. Simply understanding what some of them do is tricky enough; knowing how they’ll impact the performance of the specific applications in your environment is considerably more complicated.

A good example of this is the maximum I/O size setting within vSphere. By default, this is roughly 32MB, meaning vSphere can issue a single 32MB I/O request to storage without splitting it into multiple smaller requests. Most arrays deal with this well, and it has the effect of decreasing CPU load on the host and increasing overall throughput. However, this default setting can dramatically decrease performance on some arrays and should be changed in those situations. This is one reason why the reading I talked about earlier is so important; an issue like this could be biting you right now and you might not know it.
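For illustration, on ESX/ESXi 4.x this lives in the Disk.DiskMaxIOSize advanced option, expressed in KB, and can be read and set from the command line. The value of 128 below is purely an example of the syntax, not a recommendation; the right value is array-specific and should come from your vendor’s documentation:

```shell
# View the current maximum I/O size (value is in KB; 32767 KB is the
# default, i.e. roughly 32MB).
esxcfg-advcfg -g /Disk/DiskMaxIOSize

# Lower it to 128KB -- only do this if your array vendor's
# best-practices doc calls for it.
esxcfg-advcfg -s 128 /Disk/DiskMaxIOSize
```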

However, it’s good to be familiar with some broad concepts:

Multi-Path I/O

MPIO (Multi-Path I/O) can have an enormous impact on storage performance and reliability. If you configure vSphere simply to connect to your storage and do nothing else, most configurations won’t immediately make use of MPIO, except perhaps as a means to fail over if one of your HBAs or NICs dies or loses connectivity to the storage. Getting vSphere to take full advantage of multiple HBAs or NICs requires extra configuration.

VMware vSphere 4.1’s storage layer includes a few MPIO-related components that are important to understand. The overarching architecture is called the PSA (Pluggable Storage Architecture), which encompasses the default NMP (Native Multipathing Plug-in). Within the NMP, VMware has implemented several PSPs (Path Selection Plug-ins) and SATPs (Storage Array Type Plug-ins), which control how vSphere makes use of multiple fabric pathways to your storage, whether FC or iSCSI.

Much of the MPIO-related tweaking you’ll do will revolve around influencing the behavior of these components, either by modifying which PSP/SATP you use (enabling the round-robin PSP, for example) or by implementing a plug-in that your storage vendor has written specifically for your array.
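As a sketch of what that looks like on a 4.x host (the naa device identifier below is a placeholder; substitute one from the device list output), most of this work happens in the esxcli nmp namespace:

```shell
# List the SATPs vSphere knows about and the default PSP each uses.
esxcli nmp satp list

# Show which SATP and PSP currently claim each LUN.
esxcli nmp device list

# Switch a single LUN to the round-robin PSP.
esxcli nmp device setpolicy --device naa.60060160a0b12345 --psp VMW_PSP_RR

# Make round robin the default for every LUN claimed by the ALUA SATP.
esxcli nmp satp setdefaultpsp --satp VMW_SATP_ALUA --psp VMW_PSP_RR
```

As always, confirm against your array vendor’s best-practices doc before changing path policies; round robin is not appropriate for every array.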

If your vendor has made a plug-in available for your array, you should strongly consider installing it. These plug-ins are variously called MPPs (multipathing plug-ins), DSMs (device-specific modules), or MEMs (multipathing extension modules). They take advantage of the plug-in architecture in vSphere 4’s PSA, allowing vSphere to direct traffic to your storage in the most efficient way possible. Usually, this involves I/O queuing mechanisms that steer each request to the most advantageous storage controller port, either the least busy port or a port on the controller that “owns” the particular LUN the traffic is destined for. It’s worth noting that this plug-in architecture is only available in vSphere Enterprise or better.

If you don’t have Enterprise or your vendor hasn’t published a DSM, that doesn’t mean you can’t take advantage of MPIO. For example, the HP EVA I mentioned earlier is ALUA-compliant (Asymmetric Logical Unit Access), so it can inform vSphere’s built-in ALUA-aware SATP of which paths are the most advantageous to use. Many other arrays work similarly.
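Some vendor best-practices docs also recommend tuning how often the round-robin PSP rotates between paths. Purely as an example of the 4.x syntax (the device ID is a placeholder, and switching paths after every single I/O is not a universal recommendation):

```shell
# Rotate to the next path after every I/O instead of the default
# 1,000 IOPS per path -- only if your array vendor recommends it.
esxcli nmp roundrobin setconfig --device naa.60060160a0b12345 \
    --type "iops" --iops 1
```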

vStorage APIs for Array Integration

VMware VAAI (vStorage APIs for Array Integration) is a set of APIs that implement a few new SCSI commands that allow vSphere to off-load some storage tasks such as virtual machine cloning and thin provisioning directly onto the storage array. If you routinely deploy virtual machines (in a VDI environment, for example) or perform a lot of LUN rebalancing, this feature will substantially improve performance. Essentially, VAAI is an example of intelligent integration, allowing the storage to perform tasks internally without needing the host’s involvement.

VAAI isn’t yet supported by all arrays. Some have yet to get around to releasing firmware updates to support it, while others are waiting for the SCSI extensions used in VAAI to become SCSI T10 ratified standards; most of the majors hope to get them out the door by Q4 of this year. Additionally, you’ll need vSphere Enterprise edition or better to even be able to use it.
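If you want to check whether VAAI’s three primitives are switched on for a 4.1 host, the relevant advanced options can be read from the command line; a quick sketch:

```shell
# 1 = enabled, 0 = disabled on an ESX/ESXi 4.1 host.
esxcfg-advcfg -g /DataMover/HardwareAcceleratedMove   # full copy (cloning)
esxcfg-advcfg -g /DataMover/HardwareAcceleratedInit   # block zeroing
esxcfg-advcfg -g /VMFS3/HardwareAcceleratedLocking    # ATS locking
```

Note that these options being enabled on the host doesn’t mean your array supports the primitives; the array still has to advertise them.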

Storage I/O Control

VMware SIOC (Storage I/O Control) is another feature introduced in VMware vSphere 4.1. Essentially, it lets you specify a latency threshold above which vSphere will start throttling the I/O sent to the array. This can prevent the array from being saturated with traffic; there’s usually a point beyond which arrays become dramatically less efficient as you pile on more load. You can also set shares and limits on individual VMs so that the most critical VMs get the biggest slice of the disk performance pie, something that was nearly impossible to do before SIOC. As with VAAI, this feature is license-limited, in this case to vSphere Enterprise Plus.

Putting it all together

No matter how busy you are, try to make some time to get under the hood of your storage environment. At first, shaving a few percentage points from storage response times might not seem like it’s worth the effort, but the lessons you’ll learn about your environment will help you immensely down the road.

This article, “VMware vSphere storage performance tuning,” originally appeared at InfoWorld.com. Read more of Matt Prigge’s Information Overload blog and follow the latest developments in storage at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.