Matt Prigge
Contributing Editor

Quick rules to avoid overbuying or underbuying storage

analysis
Nov 30, 2009 | 8 mins

Appropriately sizing a new primary storage environment can be tricky. Follow these guidelines to avoid doing it wrong

Choosing a primary storage solution for your organization can be a complicated task. Perhaps the most important thing you can do to ensure that you end up with a well-designed and cost-effective solution is to have a solid understanding of the storage needs of your environment — both present and future. If you fail to do this before you start to review the myriad storage solutions on the market today, you’ll waste a lot of time and may end up with a solution that doesn’t fit your needs.

Unless you are intentionally buying storage that will be dedicated to a single high-performance system, the best place to start is to look at literally every storage user in your environment. Highly redundant, high-performance primary storage is not cheap, so the best way to leverage the investment you’re going to make is to ensure that it delivers the maximum benefit to as large a percentage of your infrastructure as possible.

Here are some quick rules to follow and common pitfalls to avoid in the key areas of consideration.

Capacity

Your current storage capacity requirements are probably the easiest thing to determine. Look at the disk space used by all of your servers, add that up, and voilà — you have a number. But of course, it’s not that simple. There are three core factors to consider that can cause your storage estimate to be inaccurate.

The first and often hardest factor to estimate is data growth. Having a system in place that will monitor your data usage over time and allow you to extrapolate into the future is really the only way to accurately assess this need. Whether or not you’re currently considering buying storage, you should implement a system to extrapolate future needs. Knowing when you’re going to need to make your next storage investment well in advance will both allow you to budget more effectively and give you a chance to implement data growth control measures before it’s too late.
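
As a rough illustration of that extrapolation, the sketch below fits a straight line to monthly usage samples and projects when an array would fill. The sample figures and the 2TB capacity are made up for the example; a real series would come from whatever monitoring tool you put in place.

```python
# Hypothetical monthly usage samples: (month index, GB used).
samples = [(0, 900), (1, 930), (2, 965), (3, 995), (4, 1030), (5, 1060)]

# Least-squares slope: average growth in GB per month.
n = len(samples)
mean_x = sum(m for m, _ in samples) / n
mean_y = sum(g for _, g in samples) / n
slope = (sum((m - mean_x) * (g - mean_y) for m, g in samples)
         / sum((m - mean_x) ** 2 for m, _ in samples))

capacity_gb = 2000                # assumed usable capacity
current_gb = samples[-1][1]
months_left = (capacity_gb - current_gb) / slope
print(f"Growing ~{slope:.0f}GB/month; ~{months_left:.0f} months until full")
```

Even a simple linear fit like this is enough to put your next storage purchase on a budget calendar rather than discovering the need in a crisis.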

The second factor to consider is what level of overallocation you’re going to try to maintain. You’d be unlikely to allocate 105GB of storage for a volume containing 100GB of data — the chances are good that the data would grow and fill that empty space before you could react. If you’re coming from a predominantly direct-attached storage environment, chances are the overallocation in your environment is already extremely high. Adding internal storage to an in-place production server is generally something most people try to avoid, so it’s common to see a server purchased with two or three times as much storage as it needs. Not having to do this is one of the primary benefits of using a shared storage solution, but it’s important to consider that you’re probably still going to want to maintain at least 20 to 30 percent free space on every volume you have.
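
The headroom arithmetic is simple but easy to get backward: to keep a fraction of a volume free, divide by one minus that fraction rather than multiplying the data size. A small sketch, using an assumed 25 percent free-space target from the 20 to 30 percent range above:

```python
# Size a volume so a target fraction stays free.
data_gb = 100          # live data on the volume (illustrative)
free_fraction = 0.25   # assumed target within the 20-30% range

# Dividing (not multiplying) keeps the free fraction of the *provisioned*
# size: 133GB provisioned leaves ~33GB (25%) free for 100GB of data.
provisioned_gb = data_gb / (1 - free_fraction)
print(f"Provision ~{provisioned_gb:.0f}GB for {data_gb}GB of data")
```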

The third major capacity factor to consider is any planned usage of snapshots in the storage environment. Various storage vendors have implemented dramatically different snapshot technologies, so this is something you will need to revisit as you consider specific products. The amount of space that snapshots will use is typically related to the rate of data change on your storage volumes. Note that this is not the same as data growth — data can and does frequently change without growing. Databases and e-mail servers are great examples of this type of turnover. A quick rule of thumb is to set aside between 50 and 100 percent of the actual size of your data for snapshots. How often you want to take snapshots, how long you want to keep them, the specific snapshot technology in use, and your data change rate will significantly affect this calculation.
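
Applying the 50-to-100-percent rule of thumb gives a first-pass bracket for snapshot reserve before you refine it against a specific vendor's snapshot technology. The data size below is illustrative:

```python
# First-pass snapshot reserve using the 50-100% rule of thumb.
data_gb = 500                   # illustrative data set size
snap_low, snap_high = 0.5, 1.0  # fraction of data reserved for snapshots

reserve_low = data_gb * snap_low
reserve_high = data_gb * snap_high
print(f"Snapshot reserve: {reserve_low:.0f}-{reserve_high:.0f}GB "
      f"(volume + snapshots: {data_gb + reserve_low:.0f}-"
      f"{data_gb + reserve_high:.0f}GB)")
```

Where you land inside that bracket depends on snapshot frequency, retention, the vendor's implementation, and your measured change rate.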

Performance

Judging the amount of performance that your disk architecture will need requires a good understanding of the various metrics used to judge disk performance. This topic can become extremely complex the deeper you delve into it, but there are some general rules that will usually allow you to make an accurate estimate of your needs.

Before you begin to consider the storage performance needs of your applications, the first thing you can do is to forget how important they are. That may sound ridiculous, but I have seen time and time again that organizations will unconsciously assume that because an application is “important,” it requires more performance. Or, worse, because an application is “unimportant,” they assume it will require less.

In the first case, you risk buying disk resources that you don’t need. In the second case, you may end up with a poorly performing application. Important or not, nobody likes either of these situations. An application’s relative importance within your organization may lead you to increase the amount of performance headroom you leave it to ensure you have room for unexpected demand spikes, but it shouldn’t influence your initial analysis.

In rough terms, there are two major performance metrics that will influence the type of disk you buy: data throughput and transactional throughput.

Data throughput

“Data throughput” refers to the raw amount of data you can read or write to your storage in a given amount of time and is generally reflected in megabytes per second. This is a performance metric that many people are familiar with, as it has direct analogs in the networking world. However, it is also often overemphasized and generally misunderstood.

In short, there are very few instances where raw data throughput actually ends up mattering a great deal. For example, I can construct a test that will sequentially read 2MB chunks of data from a given SAN, and I may even be able to nearly max out the 4Gbps Fibre Channel that I’ve attached to it with 400MBps of traffic. That’s great, but outside of video editing, medical imaging, and a few other niche fields, very few applications will ever come close to doing this in real life. The most common high-bandwidth use case that you’re likely to see in your environment is direct-from-SAN backup.
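
One rough way to see where that 400MBps figure comes from: 4Gbps Fibre Channel uses 8b/10b encoding, so only eight of every ten bits on the wire carry payload. This is a back-of-envelope sketch, ignoring framing and protocol overhead:

```python
# Approximate usable throughput of a nominal 4Gbps Fibre Channel link.
line_rate_bps = 4_000_000_000
encoding_efficiency = 8 / 10   # 8b/10b line coding

# bits/s * payload fraction -> bytes/s -> MB/s
payload_mbps = line_rate_bps * encoding_efficiency / 8 / 1_000_000
print(f"~{payload_mbps:.0f}MBps usable per direction")
```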

While it’s certainly important to ensure that your chosen storage platform can meet these needs, focusing only upon maximum data throughput misses one of the most critical aspects of disk performance.

Transactional throughput

“Transactional throughput” refers to the total number of small, randomized disk transactions that your disk architecture can carry out in a given amount of time and is usually reflected in I/O per second (IOPS). In short, transactional performance is what will generally make or break your storage environment, not how much raw data you can move.

Most critical applications are based on structured data systems such as databases. These types of systems generally produce a significant amount of very small data transactions that are very rarely sequential. As such, your storage system’s memory-based cache is not likely to be particularly effective, nor will you be using the bandwidth available in your storage interconnect to a great degree. The key factor in providing transactional throughput is the number and type of disks you have in your storage system. In these cases, the amount of time that it takes a drive head to seek data on one part of a disk platter and then skip over to the other side and get another piece makes a much larger difference than whether you have 4Gbps or 8Gbps Fibre Channel. This type of load is why 15,000-rpm serial-attached disks and solid-state drives (SSDs) exist.
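
The mechanics above translate directly into a per-spindle IOPS estimate: each random I/O costs roughly one average seek plus half a rotation. The sketch below uses assumed, datasheet-typical seek times rather than measured values, and ignores caching and RAID write penalties:

```python
# Rough random-IOPS ceiling for a single spinning disk.
def disk_iops(avg_seek_ms, rpm):
    rotational_ms = (60_000 / rpm) / 2   # average wait: half a rotation
    return 1000 / (avg_seek_ms + rotational_ms)

iops_15k = disk_iops(3.5, 15_000)   # ~180 IOPS per 15k-rpm drive
iops_7k2 = disk_iops(8.5, 7_200)    # ~80 IOPS per 7,200-rpm drive

drives = 24  # illustrative spindle count
print(f"{drives} x 15k drives: ~{drives * iops_15k:.0f} random read IOPS")
```

The spindle count multiplies straight through for random reads, which is why arrays scale transactional performance by adding disks rather than faster interconnects.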

Because of this, it’s absolutely critical to monitor the transactional disk load your applications exert on their storage. Make sure you perform the monitoring long enough to capture the spikes generated by transient events such as month-end processing and backups. These are the events that will really push your storage platform and will often define whether its implementation has been a success.
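
When you reduce that monitoring data to a sizing number, look at a high percentile rather than the average, since the average buries exactly the spikes you care about. A sketch with made-up IOPS samples; a real series would come from perfmon, iostat, or your array’s own reporting tools:

```python
# Made-up IOPS samples including a month-end processing spike.
samples = [220, 240, 230, 250, 1900, 2100, 1850, 260, 240, 230]

average = sum(samples) / len(samples)
# Simple nearest-rank 95th percentile of the sorted samples.
peak_95 = sorted(samples)[int(len(samples) * 0.95) - 1]
print(f"avg {average:.0f} IOPS, 95th percentile {peak_95} IOPS")
```

Here the average (~750 IOPS) would badly undersize a platform that actually has to sustain nearly 2,000 IOPS during its busy windows.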

If you have never done this kind of monitoring in your environment before, there’s a good chance that your conceptions of an app’s importance will not match the amount of resources it consumes. I’d need a few more hands to count the number of times that I’ve seen an e-mail server generate as much or more transactional disk load than a critical line-of-business application.

Putting it all together

The bottom line is that the more information you collect about your storage environment before you go shopping, the better off you’ll be. Storage vendors will be able to more accurately specify storage that will do the job you need — and will be less likely to try to sell you something you don’t need. You’ll be able to immediately disqualify products that don’t fit your needs and reach an accurate decision much more quickly.

Even if you’re not considering buying storage now or already have a centralized primary storage environment, consistently collecting this data and reporting on it is critical to ensuring that you are continuing to deliver the resources that your applications need.

This article, “Quick rules to avoid overbuying or underbuying storage,” originally appeared at InfoWorld.com. Follow the latest developments on data storage and storage management at InfoWorld.com.