by Dick Benton

Modernizing the storage stack

analysis
Aug 28, 2013 | 7 mins

As technologies like replication and RAID trickle down storage tiers, storage professionals gain broader options and greater control

Not that long ago, properly aligned storage tiers were created by hand, with fixed purposes for fixed applications. This is changing rapidly, as new methods of ensuring data availability, speed, and redundancy are emerging.

Automated tiering has only recently become possible due to advances in processing power, storage types, and bandwidth growth. Add in the cloud, and the situation grows even more fluid.

In this week’s New Tech Forum, Dick Benton, principal consultant at GlassHouse Technologies, takes us through the quickly evolving world of storage tiering — from where we are today to where we’re going across all types of storage media. — Paul Venezia

The sea change in storage tiering philosophy

In the early days of storage tiering, tiers were developed primarily according to key cost differentiators, chief among them the cost of the additional storage needed to support data protection. That protection was provided by replication, snapshots, backup copies, and RAID configuration, and it was generally assumed that an application requiring tight and expensive recovery would also require good performance.

Vendors cooperated by ensuring that their enterprise-class offerings all supported synchronous, if not asynchronous, replication, along with abundant ports and strong performance characteristics. Tiers were created in which the highest and most expensive typically supported near-zero data loss, with return to operation well under 24 hours. Lower tiers supported increasing data loss tolerance and lengthier recovery times. Many organizations had one set of data loss tolerances and return-to-operation targets for localized partial failures, and a different, often somewhat looser set of attributes for recovery in the event of an entire site failure (aka disaster). Then four things happened.

A shake-up in storage tiering

First, the sophisticated replication and snapshot technology, heretofore only available on expensive, enterprise-class storage frames, gradually began to appear in midlevel and low-end frames. Today, there is probably not a storage array on the market that does not support replication and snapshots.

The second game changer was the new capability that allowed data to be striped across the entire array instead of over just the few disks in a SCSI tray. This capability put the importance of RAID configurations on the back burner, as RAID 10 performance could now be achieved through striping across multiple spindles.

Next came the advent of solid-state storage devices (SSDs), coupled with array software able to automigrate heavy I/O loads from lower- to higher-capability media and back down again as IOPS requirements dropped.

Finally, good old Moore’s Law, or a derivative of it, drove down the cost of storage hardware to the point that remote replication to a distant site became economically feasible, if not outright justified, as data began growing explosively at 30 percent or more per year. At the same time, customer service and compliance awareness increasingly demanded higher levels of availability through quick recovery.

These game changers rendered obsolete many of the design concerns, and much of the philosophy, behind the old bundled tiers based on protection and performance. Replication and snapshots became available on high-end, midlevel, and low-end arrays alike. The need for mirrored disks to achieve write efficiency was eliminated by the ability to stripe across the frame, and the ability to restore and recover from a single disk failure was similarly enhanced by the new striping paradigm.

Performance issues could now be addressed by migration within the storage array across different disk/bus technologies, from cheap SATA to extraordinarily expensive SSDs. Indeed, an SSD holding of just 1 percent of capacity could accelerate the performance of that frame by significant increments.
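To see why such a small flash holding punches above its weight, consider a rough back-of-envelope model. The figures below (per-terabyte IOPS capabilities, a 100TB frame, and a workload where half of all I/O concentrates on the hottest 1 percent of capacity) are illustrative assumptions, not vendor specifications:

```python
# Back-of-envelope sketch of autotiering's effect on frame performance.
# All numbers here are assumptions for illustration, not measurements.

SATA_IOPS_PER_TB = 500     # assumed capability of cheap SATA spindles
SSD_IOPS_PER_TB = 50_000   # assumed capability of 2013-era enterprise flash

capacity_tb = 100.0
ssd_fraction = 0.01        # just 1 percent of frame capacity on SSD
hot_io_fraction = 0.5      # assumed skew: half of all I/O hits a tiny hot set

ssd_tb = capacity_tb * ssd_fraction
sata_tb = capacity_tb - ssd_tb

sata_pool_iops = sata_tb * SATA_IOPS_PER_TB   # 49,500
ssd_pool_iops = ssd_tb * SSD_IOPS_PER_TB      # 50,000

# Baseline: everything on SATA.
baseline_iops = capacity_tb * SATA_IOPS_PER_TB  # 50,000

# With autotiering, the frame sustains the highest total rate at which
# neither pool is overdriven by its share of the I/O mix.
tiered_iops = min(sata_pool_iops / (1 - hot_io_fraction),
                  ssd_pool_iops / hot_io_fraction)

print(f"baseline: {baseline_iops:,.0f} IOPS")
print(f"with 1% SSD autotiering: {tiered_iops:,.0f} IOPS "
      f"({tiered_iops / baseline_iops:.1f}x)")
```

Under these assumed numbers, the 1 percent flash slice roughly doubles the frame’s sustainable IOPS; the real gain depends entirely on how skewed the workload’s access pattern is.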

Out with the old, in with the new

These technology advancements are driving a sea change in storage tiering philosophy. The old fixed menu of four or five tiers of bundled performance and protection is giving way to an à la carte menu, or buffet, of available services. Performance characteristics and protection can now each be selected independently, then bundled into a consumable service. Performance planning no longer needs to concern itself with RAID configurations or even disk technologies if autotiering is implemented. An application no longer has to be allocated to a specific tier of storage based on its recovery requirements.

Recovery requirements can instead be policy-based, and that policy can be executed on any of the storage frames on which the application or its components reside. With remote and local replication now commonly table stakes in many data centers, applications can map their tables, log files, and images onto the appropriate storage array without concern for dependencies triggered by a sequenced disaster recovery. This is very much unlike the old days, when only tier 1 was replicated and ensuring that all applications in that tier sat on the same storage device was critical.
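As a concrete illustration, the minimal sketch below models protection as a named policy attached to each application rather than to the frame it happens to live on. The policy names, fields, and applications are hypothetical; a real deployment would express this through the storage or SRM vendor’s own policy engine:

```python
# A minimal sketch of policy-based protection, assuming hypothetical
# policy names and applications. Protection follows the application,
# not the storage frame.

from dataclasses import dataclass

@dataclass
class ProtectionPolicy:
    name: str
    rpo_minutes: int         # tolerable data loss
    rto_minutes: int         # tolerable return-to-operation time
    local_snapshots: bool
    remote_replication: str  # "sync", "async", or "none"

POLICIES = {
    "gold": ProtectionPolicy("gold", rpo_minutes=0, rto_minutes=60,
                             local_snapshots=True, remote_replication="sync"),
    "silver": ProtectionPolicy("silver", rpo_minutes=60, rto_minutes=240,
                               local_snapshots=True, remote_replication="async"),
    "bronze": ProtectionPolicy("bronze", rpo_minutes=1440, rto_minutes=2880,
                               local_snapshots=False, remote_replication="none"),
}

# The point of the new model: two applications on the same array can
# carry entirely different protection policies.
APP_POLICY = {
    "order-entry-db": "gold",
    "reporting-warehouse": "silver",
    "dev-test-images": "bronze",
}

for app, policy_name in APP_POLICY.items():
    p = POLICIES[policy_name]
    print(f"{app}: RPO {p.rpo_minutes}m, RTO {p.rto_minutes}m, "
          f"replication={p.remote_replication}")
```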

Today, storage tiering philosophy is driven by three mission-critical criteria:

  1. First, there is the need to understand the scalability of the applications being serviced. How many ports will they consume? What is the current rate of port consumption? Can the planned target array handle that growth? What is the next step in scaling connectivity for that storage frame?
  2. The second is performance. While most consumers still cannot empirically define their requirements in terms of IOPS, latency, or bandwidth, these metrics are key components of the new storage tier if it is to be accepted by the consumer community. Where autotiering within the frame has been configured, publication of a blended performance rate, along with its lowest and highest thresholds, should suffice. The scalability and performance attributes are the tier’s cost and service differentiators.
  3. The third criterion, data protection, is now a service selection option rather than a tier selection option. Policy-driven protection for operational and disaster recovery can be configured (and often is today) at the application level rather than at the storage frame. This demands a rigorous project introduction process, as well as change and release management processes, but different RPO/RTO combinations can nonetheless be applied to most arrays across the various tiers of scalability and performance using today’s technology. Other characteristics, such as encryption and immutability, can be addressed in the same manner, as sketched below.
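Putting the three criteria together, here is a minimal sketch of the à la carte model: the tier carries only scalability and performance attributes (including a published blended rate with its low and high thresholds), while protection, encryption, and immutability are selected separately and composed into a consumable service. All tier names and figures are assumptions for illustration:

```python
# A sketch of the à la carte model. Tier names, rates, and thresholds
# are hypothetical; the structure, not the numbers, is the point.

from dataclasses import dataclass

@dataclass
class PerformanceTier:
    name: str
    blended_iops_per_tb: int   # published blended rate under autotiering
    iops_floor: int            # lowest published threshold
    iops_ceiling: int          # highest published threshold
    max_ports: int             # scalability headroom

@dataclass
class StorageService:
    tier: PerformanceTier
    protection_policy: str     # e.g. "gold"/"silver"/"bronze" from above
    encrypted: bool = False
    immutable: bool = False

TIERS = {
    "performance": PerformanceTier("performance", 2_000, 500, 50_000, 64),
    "capacity": PerformanceTier("capacity", 400, 200, 2_000, 16),
}

# Any protection level can ride on any tier -- a combination the old
# fixed four-or-five-tier menus never allowed.
svc = StorageService(tier=TIERS["capacity"], protection_policy="gold",
                     encrypted=True)
print(f"tier={svc.tier.name}, protection={svc.protection_policy}, "
      f"encrypted={svc.encrypted}, immutable={svc.immutable}")
```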

Here comes the cloud

It’s important to understand yet another trend as cloud services become more prevalent. The cloud’s chief characteristics are the service consumer’s ability to self-configure elastic storage, with the provider automating the provisioning process and, most important, generating a bill. Without billing, the internal cloud is like a three-legged stool with one leg missing: All the motivators for cost-efficiency and service improvement disappear without a financial incentive.

The final challenge in today’s tiering philosophy is to develop and deploy a demonstrably accurate, fair, and transparent cost model, along with a supporting billing process for chargeback or showback. This can be difficult where multiple disk types provide a single service tier under autotiering, and where the technology has yet to deliver orchestration and statistics granular enough to make billing for actual usage feasible.

Blended costs are often the most appropriate answer today. Protection costs become relatively easy to calculate using a multiplier based on the additional storage that will be consumed during the protection retention cycle, perhaps loaded by some form of port or bandwidth cost. Encryption and immutability requirements present their own unique but not insurmountable costing challenges, too. At the end of the day, in 2013, it’s critical for IT to know the unit cost of deployed resources and to whom those resources have been deployed.
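As a hedged illustration of such a blended model, the sketch below weights assumed per-gigabyte media costs by each medium’s share of the autotiered pool, then applies a protection multiplier for the extra copies held across the retention cycle plus a flat port/bandwidth loading. Every rate and multiplier here is invented for the example:

```python
# A sketch of a blended chargeback calculation. Unit costs, pool mix,
# and multipliers are illustrative assumptions, not a price list.

# Blended media cost: weighted by each medium's share of the pool.
MEDIA_COST_PER_GB_MONTH = {"ssd": 10.00, "fc": 2.00, "sata": 0.50}  # assumed
POOL_MIX = {"ssd": 0.01, "fc": 0.29, "sata": 0.70}                  # assumed

blended_cost = sum(MEDIA_COST_PER_GB_MONTH[m] * share
                   for m, share in POOL_MIX.items())

def monthly_charge(allocated_gb: float,
                   protection_multiplier: float = 1.0,
                   port_load: float = 0.0) -> float:
    """Charge = blended media cost x protection overhead + port loading.

    protection_multiplier approximates the additional storage consumed
    across the retention cycle (e.g., 2.5x for a synchronous replica
    plus snapshots); port_load is a flat monthly connectivity loading.
    Both values are assumptions.
    """
    return allocated_gb * blended_cost * protection_multiplier + port_load

# Example: 500GB with sync replication and snapshots, plus a port loading.
print(f"blended rate: ${blended_cost:.2f}/GB-month")
print(f"charge: ${monthly_charge(500, protection_multiplier=2.5, port_load=75):,.2f}")
```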

New Tech Forum provides a means to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all enquiries to newtechforum@infoworld.com.

This article, “Modernizing the storage stack,” was originally published at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.