Matt Prigge
Contributing Editor

How to stress-test primary storage

analysis
Nov 29, 2010 | 11 mins

You've unpacked your eval unit -- now here's how to put together a test plan and kick it around the block before you buy

The performance of primary storage is more likely to affect the performance of your applications than the network, server CPUs, or the size and speed of server memory. That’s because storage tends to be the biggest performance bottleneck, especially for applications that rely heavily on large databases.

That makes testing crucial. Before you buy, you need to know how well your applications perform on the specific storage hardware you’re eyeing. As I noted last week, most vendors will provide a loaner for you to test-drive.

Unfortunately, testing storage is not always a straightforward process. It requires a solid understanding of how your applications use storage and how the storage device you’re evaluating functions under the hood. Each situation is different; no single test or benchmark can give everyone the answer they’re looking for, but you can take some basic evaluative steps to ensure your storage is up to the task.

Knowing what to test

The tests you run on your prospective storage hardware will largely depend upon what you’re doing with it. Someone in search of storage for a video editing suite, for example, will have drastically different storage needs than someone who runs a large enterprise database. These tests fall into two familiar categories: throughput and random seek.

Raw data throughput is simply the amount of data you can move on or off storage hardware in a given period of time, usually expressed in MBps. Unfortunately for most enterprise applications, this figure is relatively meaningless. It also happens to be the most frequently quoted performance metric for marketing purposes, as well as the easiest to test and the easiest for hardware to excel at. It’s no wonder throughput numbers lead to misconceptions about storage performance.

High levels of raw throughput accelerate the transfer of very large files, but most applications rarely — if ever — incur this kind of disk workload. Nonetheless, raw throughput tests can be very useful for validating the implementation of your storage network, whether Fibre Channel or iSCSI, though they do little to stress the disk subsystem itself.

Almost always, the more important metric is the number of small, randomized disk transactions that can be completed in a given period of time, expressed in I/O operations per second or IOPS. This is the kind of workload most databases and email servers put on your disk. Poor transactional disk performance is the usual underlying reason for poor database performance, second only to poorly conceived application and database design — but that’s another story. The number of disk transactions that can be completed per second is a function of latency — that is, the time a disk resource requires to serve each request.
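To make the latency-to-IOPS relationship concrete, here’s a quick back-of-the-envelope calculation (a simplified model assuming one outstanding request at a time, or ideal scaling with queue depth):

```python
def iops_from_latency(latency_ms: float, queue_depth: int = 1) -> float:
    """Idealized IOPS: each request takes latency_ms, and queue_depth
    requests can be in flight at once (real disks scale less cleanly)."""
    return queue_depth * 1000.0 / latency_ms

# A disk averaging 8 ms per random request, one request at a time:
print(iops_from_latency(8.0))       # -> 125.0
# The same latency with 16 requests queued, assuming perfect scaling:
print(iops_from_latency(8.0, 16))   # -> 2000.0
```

By this model, a 12-millisecond average latency caps a single request stream at roughly 83 IOPS, which is why the latency figure is worth watching as closely as the IOPS figure.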

Drilling into transactional performance

The type of disk you use and the intelligence of the array determine the transactional disk performance you can expect. The small and randomized nature of these workloads puts stress on storage hardware, because read/write heads run into physical limitations as they jump all over spinning mechanical disks.

This constraint is one of the main reasons SSD (solid-state disk) arrays are becoming more popular for transaction-intensive applications. However, SSDs cost so much more than conventional disk that they can be justified only for high-end applications.

For the most part, we’re stuck with disk heads zipping around various physical points on disk platters, working with very small chunks of data over and over. Worse, writes tend to take longer than reads, so write-heavy loads can really whack performance. To give you an idea of how big of a deal this is, I ran a brief, relatively unscientific test on my poor three-year-old laptop.

First, I configured a test that would sequentially read 4KB chunks from a 1GB file. It was able to do this about 3,560 times a second (3,560 IOPS) with an average latency of about 0.25 millisecond per transaction. Pretty good, right? Not so fast.

I reconfigured it to perform that same test, but 30 percent of the time, it would write to the disk instead of reading from it. Now, I only get about 750 disk transactions per second with an average latency of about 1.3 milliseconds per transaction. Worse, but not as bad as it’s going to get: I reconfigured it to perform that same 70/30 percent read/write split entirely randomly within that 1GB file, so the disk head would seek all over the place. The result: 80 IOPS and an average latency of about 12 milliseconds. That’s nearly 45 times fewer transactions and 48 times more latency per transaction than the sequential read-only test.
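If you want to reproduce this kind of quick-and-dirty test yourself, a rough sketch in Python follows. The file name, size, and operation counts here are illustrative and scaled well down; for meaningful numbers, use a file much larger than RAM, since the OS page cache will otherwise flatter the results.

```python
import os
import random
import time

PATH = "testfile.bin"          # hypothetical scratch file
FILE_SIZE = 64 * 1024 * 1024   # 64MB here; use far more than RAM in a real test
BLOCK = 4096                   # 4KB transfers, as in the tests above
OPS = 2000

# Create the test file once.
with open(PATH, "wb") as f:
    f.write(os.urandom(FILE_SIZE))

def run(randomize: bool, write_pct: int):
    """Return (IOPS, avg latency in ms) for a 4KB workload."""
    fd = os.open(PATH, os.O_RDWR)
    blocks = FILE_SIZE // BLOCK
    buf = os.urandom(BLOCK)
    start = time.perf_counter()
    for i in range(OPS):
        # Either hop to a random block or walk the file sequentially.
        off = (random.randrange(blocks) if randomize else i % blocks) * BLOCK
        if random.randrange(100) < write_pct:
            os.pwrite(fd, buf, off)
        else:
            os.pread(fd, BLOCK, off)
    elapsed = time.perf_counter() - start
    os.close(fd)
    return OPS / elapsed, elapsed / OPS * 1000

print("sequential reads:     %.0f IOPS, %.3f ms" % run(False, 0))
print("sequential 70/30 r/w: %.0f IOPS, %.3f ms" % run(False, 30))
print("random 70/30 r/w:     %.0f IOPS, %.3f ms" % run(True, 30))
os.remove(PATH)
```

On a mechanical disk and an uncached file, the spread between the first and third results should look like the laptop numbers above; on an SSD or a fully cached file, the gap will all but vanish.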

There’s the rub: Most database platforms create storage workloads that resemble the most challenging of those three tests. To improve the performance of disk subsystems under these types of loads, the most common approach is to spread the work across many physical disks in an array. Generally speaking, the more spindles dedicated to a workload, the faster it will perform.
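Spindle-count arithmetic can be sketched with the common rule-of-thumb formula for RAID write penalties (the per-disk IOPS figure and penalty values below are generic estimates for illustration, not vendor numbers):

```python
def array_iops(disks: int, per_disk_iops: int,
               read_pct: float, write_penalty: int) -> float:
    """Rough effective IOPS for a RAID group: reads hit one disk,
    but each logical write costs write_penalty physical I/Os
    (commonly ~2 for RAID 10, ~4 for RAID 5)."""
    raw = disks * per_disk_iops
    write_pct = 1.0 - read_pct
    return raw / (read_pct + write_pct * write_penalty)

# Eight 15K rpm disks (~180 IOPS each is a common rule of thumb),
# 70/30 read/write mix:
print(round(array_iops(8, 180, 0.70, 2)))  # RAID 10 -> 1108
print(round(array_iops(8, 180, 0.70, 4)))  # RAID 5  -> 758
```

The same disk count and workload loses roughly a third of its effective IOPS moving from RAID 10 to RAID 5 in this model, which is why write-heavy transactional workloads so often land on RAID 10.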

Storage arrays typically implement memory caching that stores reads and accepts writes before they are committed to disk. The write cache is critical, because it allows the disk subsystem to absorb momentary spikes in data being written until the physical disks can catch up. That won’t save you, however, if your write load consistently comes in faster than the physical disks can accept from the cache — the cache will saturate and you’ll be waiting for the disks either way.
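That saturation behavior can be illustrated with a toy occupancy model (the rates and cache size below are made up for illustration):

```python
def cache_fill(incoming_mb_s, drain_mb_s, cache_mb):
    """Write-cache occupancy per second: fills at the incoming write
    rate, drains at what the disks can commit, capped at cache size."""
    level, history = 0.0, []
    for rate in incoming_mb_s:
        level = max(0.0, min(cache_mb, level + rate - drain_mb_s))
        history.append(level)
    return history

# Disks commit 100 MB/s; a 10-second burst at 300 MB/s hits a 1GB cache,
# then the load drops to 50 MB/s:
levels = cache_fill([300] * 10 + [50] * 10, 100, 1024)
print(levels[5], levels[9], levels[19])  # the cache pins at 1024 MB mid-burst
```

Once the occupancy hits the cap, incoming writes are gated by raw disk speed until the burst ends and the cache drains back down.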

Some of the more intelligent read caches are great at dealing with sequential workloads: They detect a sequential read pattern and start asking for blocks of disk before the data is read (often called “read ahead”). That way, the data is already in cache by the time it’s requested and there’s little or no delay. Read caches are not particularly effective, though, when data is read randomly from a very large data set (particularly one larger than the cache).

Testing tools

There are tons of ways to test your storage. They run the gamut from simply setting up the application you intend to run and artificially creating a load on it all the way to using tools specifically designed for disk testing. Here are three that I use frequently.

Iometer. Iometer is a graphical block-level disk I/O testing tool that was originally developed by Intel and subsequently open-sourced. It hasn’t seen much development work over the past few years, but this is less of a function of it being old software and more of a case of it being damn good at what it does. Iometer is available for both Windows and Linux.

Iometer allows you to construct fairly complicated tests involving multiple worker processes all simulating different kinds of I/O workloads at the same time. You can even link worker processes on different servers on a network to a single management console to thoroughly test shared SAN storage.

For example, you can configure an eight-worker test profile in which one worker performs sequential 64KB writes to one LUN while the other seven perform randomized 4KB reads and writes on a different LUN (essentially mimicking the transaction log and database workloads of Microsoft Exchange). Further, you can deploy that same configuration on several servers at once and sum the results from all of them in the same management console — very useful.
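The shape of that profile can be approximated with plain Python threads (a rough stand-in, not Iometer itself; the two "LUNs" here are ordinary files used as hypothetical stand-ins, and thread scheduling makes this far less precise than Iometer's workers):

```python
import os
import random
import threading
import time

DURATION = 2           # seconds per worker; keep short for a sketch
LOG_PATH = "log.lun"   # hypothetical stand-ins for the two LUNs
DB_PATH = "db.lun"
SIZE = 32 * 1024 * 1024
results = {}

def worker(name, path, block, randomize, write_pct):
    fd = os.open(path, os.O_RDWR)
    blocks = SIZE // block
    buf, ops, pos = os.urandom(block), 0, 0
    end = time.monotonic() + DURATION
    while time.monotonic() < end:
        off = (random.randrange(blocks) if randomize else pos % blocks) * block
        pos += 1
        if random.randrange(100) < write_pct:
            os.pwrite(fd, buf, off)
        else:
            os.pread(fd, block, off)
        ops += 1
    os.close(fd)
    results[name] = ops / DURATION

for p in (LOG_PATH, DB_PATH):
    with open(p, "wb") as f:
        f.write(b"\0" * SIZE)

# One sequential 64KB writer on the "log" LUN, seven random 4KB
# 70/30 read/write workers on the "database" LUN.
threads = [threading.Thread(target=worker, args=("log", LOG_PATH, 65536, False, 100))]
threads += [threading.Thread(target=worker, args=(f"db{i}", DB_PATH, 4096, True, 30))
            for i in range(7)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print({k: round(v) for k, v in sorted(results.items())})
for p in (LOG_PATH, DB_PATH):
    os.remove(p)
```

The point of the structure, as with Iometer, is that the mixed workloads contend with each other the way the real application's would.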

Bonnie++. Bonnie++ is a simple Unix-based tool originally developed as “Bonnie” by Tim Bray, then rewritten in C++ and heavily extended by Russell Coker. Bonnie++ can run many of the same tests that Iometer can, but its differentiating factor is that it can simulate a file system load as well as a simple block-level disk load.

On Linux/BSD platforms, you have lots of choices about what kinds of file systems to use, ranging from the Linux EXT3 file system through ReiserFS and various ZFS implementations. Each has its own pluses and minuses, and which you use is determined by what kinds of files you’re dealing with (creating lots of tiny files, fewer huge files, and so on). 

Microsoft Jetstress 2010. Microsoft’s Jetstress 2010 is a great example of a purpose-specific disk testing tool. Instead of generating random disk I/O as Iometer does, Jetstress actually implements a realistic approximation of the database back end of a Microsoft Exchange 2010 mailbox server and applies a realistic load against it. Instead of telling you how many IOPS you achieved and leaving it to you to extrapolate the impact on your application, Jetstress tells you how many Exchange users of a given usage profile you can expect to support with the storage you have. Even if you’re not running Microsoft Exchange, this can be an exceptionally good way of kicking your storage around with a realistic application usage pattern.

Things to do while you test

Whatever testing mechanism you use, there are a few things you should do while you’re testing. These exercises can shed light on huge architectural differences between disk devices that otherwise appear to perform roughly the same.

Take snapshots. If your storage device supports snapshots, take some. In fact, take a bunch — as many as you can ever see yourself using. See if that affects the overall performance (specifically, the small-block transactional write performance). Many storage arrays that support snapshots use what’s called a copy-on-write snapshot algorithm. This means that a separate area of the available disk resources is set aside to store changed data. Anytime you write to a volume while a snapshot is in place, the array must first read the data that will be replaced, write it into the snapshot area, then overwrite the original data with the new data. This can substantially degrade write performance in some circumstances.
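The copy-on-write penalty can be sketched as a toy model (illustrative only; real arrays do this at the block-device level with far more sophistication):

```python
class CowVolume:
    """Toy copy-on-write volume: with a snapshot active, the first
    write to a block costs a read plus a write to the snapshot area
    before the new data lands: three I/Os instead of one."""
    def __init__(self, blocks):
        self.data = {i: f"orig-{i}" for i in range(blocks)}
        self.snapshot = None
        self.io_count = 0

    def take_snapshot(self):
        self.snapshot = {}

    def write(self, block, value):
        if self.snapshot is not None and block not in self.snapshot:
            old = self.data[block]        # 1. read the data being replaced
            self.io_count += 1
            self.snapshot[block] = old    # 2. write it to the snapshot area
            self.io_count += 1
        self.data[block] = value          # 3. write the new data
        self.io_count += 1

vol = CowVolume(100)
vol.write(0, "a")
print(vol.io_count)   # -> 1: one I/O, no snapshot in place
vol.take_snapshot()
vol.write(1, "b")
print(vol.io_count)   # -> 4: the snapshotted write cost three I/Os
```

Note that a second write to an already-copied block costs only one I/O again, which is why copy-on-write overhead bites hardest on write patterns that touch many distinct blocks.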

Use a very large test file. Unless you’re working with very small databases and/or files, it’s best to use a test file significantly larger than whatever cache might be present on your storage array or array controller. This lets your testing bypass most of the benefit the cache would otherwise provide and gives you a good worst-case idea of how well the disks will perform.

Degrade the array. If you’re testing a disk array that uses some form of RAID (I hope you are), try yanking out a disk or two to degrade the array and trigger an array reconstruction onto a hot spare. Array leveling and reconstruction events can substantially degrade overall I/O performance. This is a very good thing to know if your applications are sensitive enough to disk performance bottlenecks that they will bog down when performance drops temporarily. The whole idea of RAID is to allow you to continue unscathed when you lose a disk. It’s important to know what the performance cost of losing a disk might be, so you can provision extra storage performance headroom to accommodate it if need be.

Allow the array to fill. Some virtualized SAN platforms use internal disk allocation mechanisms that depend on a certain amount of free array capacity to maintain high levels of performance, often requiring that you keep at least 10 percent of the array empty. To see the repercussions of failing to do this, allow your disk array to fill to the point where it is almost entirely allocated and observe the effects. If your disk array suffers a performance penalty when it’s full, then you know to avoid that situation at all costs.

Test, test, and test again

Like any other area of technology, storage is full of word-of-mouth lore: That storage subsystem is superfast, the other one isn’t worth the premium price, and so on. Received opinion is one thing, but testing with actual prospective hardware for your own data center is quite another. It’s the only way to get a feel for how your storage subsystem will perform with heavy loads, ballooning data, or disk failures. It also forces you to learn more about how your applications and storage actually work — something that nearly everyone will benefit from. When something inevitably goes wrong, you’ll have a much better feel for what might be causing it and how you can fix the problem.