by Ronald Riffe

Big data needs software-defined storage

analysis
Apr 9, 2014 · 7 mins

With demands for agility and capacity, storage systems can't be islands. IBM's Ronald Riffe explains how software-defined storage provides a broad, hardware-independent solution

The movement to smarten up every aspect of IT infrastructure rolls ever onward, from servers to the network to storage. We’ve deployed server virtualization and built automation frameworks to adapt elastically to changing workloads; we’ve also begun rebuilding our networks with SDN. Software-defined storage (SDS) is the next big trend.

SDS enables agile and elastic storage through automated processes that can adapt to changing I/O demands. In this week’s New Tech Forum, Ronald Riffe, IBM director of software-defined storage, walks us through what SDS means now and in the not-too-distant future. — Paul Venezia

Software-defined storage: A must for big data

The proliferation of mobile devices and instrumented enterprise assets is igniting a new data explosion, from which can be gleaned new analytic insights and, in turn, open new business opportunities. At the same time, big data is placing greater demand on existing infrastructures, driving a need for instant access to resources — compute, storage, and network — and creating a new imperative to adopt cloud technologies. The flexibility required simply can’t be obtained with a traditional hardware-centric approach.

Storage is a constant pain point for cloud deployments, but it has been largely ignored by IT organizations, which have focused their attention primarily on server and network virtualization. With capacity growth, application performance, and cloud-related issues challenging organizations, IT managers must improve efficiency by virtualizing not just their server infrastructure but their storage environments as well.

According to a 2013 study conducted by EMEA research, storage provisioning and management present a significant bottleneck for 58 percent of enterprise cloud deployments. As a result, storage automation was identified as the top integration requirement for the initial release of cloud projects, cited by 32 percent of organizations.

Among respondents who had already attempted to deploy a private cloud without an SDS (software-defined storage) infrastructure, an overwhelming 84 percent were planning some sort of hardware-independent storage virtualization system to support their cloud.

Abstracting storage services from underlying proprietary hardware through SDS can improve operational efficiency, provide transparent data mobility, and enable common-denominator management capabilities, regardless of the hardware used. SDS makes applications in the cloud more efficient by reducing management complexity and allowing them to scale inexpensively. A review of IDC’s Worldwide Software-Based (Software-Defined) Storage Taxonomy, 2013 report along with a consensus of storage vendors indicates SDS has the following key attributes:

  1. Software is at the heart of SDS. It’s designed to run on heterogeneous, commodity hardware and can even leverage an organization’s existing storage infrastructure.
  2. SDS provides a full suite of storage services.
  3. SDS federates physical storage capacity from multiple locations like internal disks, flash systems, other external storage systems, and soon from the cloud and cloud object platforms.
  4. SDS is easily programmable through a single, unified API that is available through a variety of portals.
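
To make the "single, unified API" attribute concrete, here is a minimal sketch of what a client for such an API might look like. The base URL, endpoint path, and payload fields below are illustrative assumptions, not any vendor's actual interface:

```python
import json

class SDSClient:
    """Hypothetical client for a unified SDS provisioning API.

    The endpoint paths and payload fields are illustrative assumptions;
    a real SDS product would publish its own REST-style API.
    """

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def provision_volume(self, name, size_gb, tier="standard"):
        # One call works regardless of which backend array actually
        # holds the capacity; that is the point of the abstraction.
        return {
            "method": "POST",
            "url": f"{self.base_url}/v1/volumes",
            "body": json.dumps({"name": name, "size_gb": size_gb, "tier": tier}),
        }

client = SDSClient("https://sds.example.com/api")
req = client.provision_volume("credit-risk-data", 500, tier="flash")
print(req["url"])
```

The same `provision_volume` call would work whether the capacity ultimately lands on internal disk, an external array, or a cloud object platform; the abstraction hides the backend.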

By reducing complexity, SDS lightens the administrative burden. It simplifies, virtualizes, and automates storage services. It improves existing storage utilization and eases data migration between storage systems and storage tiers — even among vendors — thus reducing service delivery times.

Here are a few characteristics that are important to have in any SDS infrastructure:

SDS should be open. SDS should support broad client choice in the physical storage infrastructure and integrate with other virtual compute and cloud management software. This openness helps organizations migrate to an agile, cloud-based storage environment and manage it effectively without having to replace existing storage systems, extracting far more value from existing investments. It offers capacity-based storage virtualization and automation, allowing customers to deploy the capabilities they need without licensing complexity.

SDS should be self-optimizing, intelligent, and policy-driven. SDS should adapt automatically to workload changes to optimize application performance, eliminating most manual tuning efforts. Automated tiering across storage systems from different vendors and brands can optimize storage by automatically moving the most active data to the fastest storage tier.
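
As an illustration of policy-driven tiering, the sketch below ranks data extents by I/O density and promotes the hottest ones until a flash-tier capacity budget is exhausted. The extent sizes, IOPS figures, and capacity budget are invented for the example; real SDS tiering engines use richer heat maps and account for migration cost:

```python
def plan_tiering(extents, flash_capacity_gb):
    """Pick extents to promote to flash.

    extents: list of (extent_id, size_gb, iops) tuples.
    Returns the IDs of the hottest extents that fit in the budget.
    """
    promotions = []
    used = 0.0
    # Hottest extents first: highest I/O per gigabyte.
    for extent_id, size_gb, iops in sorted(
            extents, key=lambda e: e[2] / e[1], reverse=True):
        if used + size_gb <= flash_capacity_gb:
            promotions.append(extent_id)
            used += size_gb
    return promotions

extents = [("e1", 10, 5000), ("e2", 40, 1000), ("e3", 20, 4000), ("e4", 30, 100)]
print(plan_tiering(extents, flash_capacity_gb=35))  # ['e1', 'e3']
```

Here e1 and e3 have the highest I/O density and together fit the 35GB budget, so they are promoted while the cooler, larger extents stay on the slower tier.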

SDS needs to be application aware. SDS should automate provisioning and improve productivity so that IT operations administrators can focus on overall storage deployment and utilization, as well as on longer-term strategic requirements — without being distracted by routine storage-provisioning requests. For example, the right storage solution would perform application-aware snapshot backups frequently throughout the day to reduce the risk of data loss.
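
The frequent, application-aware snapshots mentioned above amount to scheduling snapshots so that no more than one recovery-point objective (RPO) worth of data is ever at risk. A minimal sketch, with an assumed one-hour RPO over a business day:

```python
from datetime import datetime, timedelta

def snapshot_schedule(start, end, rpo_minutes):
    """Return snapshot times so at most rpo_minutes of data is at risk."""
    times, t = [], start
    while t <= end:
        times.append(t)
        t += timedelta(minutes=rpo_minutes)
    return times

# Hourly snapshots across a 8:00-18:00 business day (illustrative RPO).
day = snapshot_schedule(datetime(2014, 4, 9, 8, 0),
                        datetime(2014, 4, 9, 18, 0),
                        rpo_minutes=60)
print(len(day))  # 11 snapshot points
```

An application-aware implementation would additionally quiesce the application (for example, flush database buffers) at each of these points so the snapshot is consistent, not merely frequent.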

The value of SDS in the real world

Consider this cloud and virtual infrastructure example: A financial services company is beginning to build a private cloud with software-defined infrastructure. It has virtualized its compute infrastructure and is enabling network and storage virtualization in the next phase. Its core business applications for credit card processing and virtual desktops run in a Dallas data center, while a big data credit-risk application and an application development cloud run across the Beijing and São Paulo data centers.

The credit risk application for such an organization is a big data application that must process live data from various sources: credit card transaction data, credit bureau score data, personal customer data, Twitter feeds, and other publicly available social media data. The data-intensive credit risk application running in the Beijing cloud starts experiencing an I/O bottleneck. A good SDS infrastructure detects this issue and, through a policy applied at the storage layer, automatically provisions new flash storage. The SDS system then automatically shifts the relevant “hot” data to this new flash storage, improving I/O throughput and maintaining the service-level agreement.
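
The detect-provision-migrate flow in this scenario can be sketched as follows. The function names, SLA threshold, and extent names are assumptions standing in for a real SDS control plane:

```python
def respond_to_bottleneck(latency_ms, sla_ms, hot_extents, provision, migrate):
    """If observed latency breaches the SLA, provision flash and move hot data.

    provision and migrate are callables standing in for the SDS control
    plane; the names and thresholds here are illustrative assumptions.
    hot_extents: list of (extent_id, size_gb) tuples.
    """
    if latency_ms <= sla_ms:
        return "within SLA"
    # Provision a flash volume sized for the hot working set.
    volume = provision(tier="flash", size_gb=sum(s for _, s in hot_extents))
    # Shift the hot data onto the new, faster tier.
    for extent_id, _ in hot_extents:
        migrate(extent_id, volume)
    return f"migrated {len(hot_extents)} extents to {volume}"

# Stub control-plane calls for demonstration.
result = respond_to_bottleneck(
    latency_ms=40, sla_ms=10,
    hot_extents=[("tx-log", 50), ("scores", 20)],
    provision=lambda tier, size_gb: f"{tier}-vol-{size_gb}GB",
    migrate=lambda extent_id, volume: None)
print(result)  # migrated 2 extents to flash-vol-70GB
```

The key design point is that the application never changes: the policy engine observes the SLA breach and reshapes the storage underneath it.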

With SDS, data can be dynamically moved and seamlessly shared, storage capacity can be elastically scaled, and new performance tiers can be transparently introduced. In our example, the credit risk application overran the physical infrastructure it had been assigned. The SDS system responded, automatically provisioning new resources and moving the data as needed.

When SDS is done right, key benefits emerge:

  • SDS automates the use of on-premises and on-cloud storage resources
  • Policy-based orchestration of storage resources optimizes performance and efficiency
  • Analytics-driven optimization of software-defined resources can meet unpredictable business needs
  • Building on open APIs, tools, and technologies maximizes customer value, skills availability, and easy reuse across hybrid cloud environments

Vendors are already competing to claim the SDS space, each with its own approach. For example, IBM Virtual Storage Center enables automated, policy-driven storage tiering and virtualization of heterogeneous storage systems and can turn existing storage into private cloud storage without “rip and replace.”

A vital new component of the software-defined data center, SDS is still evolving — and will continue doing so at a rapid pace. According to a 2013 IDC report, “Software-based storage will slowly but surely become a dominant part of every data center, either as a component of a software-defined data center or simply as a means to store data more efficiently and cost-effectively compared with traditional storage.”

The right SDS solution is one that helps clients transition from traditional storage infrastructure to a more agile, cloud-ready, software-defined environment and manage it effectively. Deployed correctly, SDS simplifies and modernizes heterogeneous storage environments, uses analytic-driven data management to reduce the cost of storage, and standardizes advanced data protection capabilities across storage systems.

New Tech Forum provides a means to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all enquiries to newtechforum@infoworld.com.

This article, “Big data needs software-defined storage,” was originally published at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.