Matt Prigge
Contributing Editor

In search of the jack-of-all-trades

analysis
Jul 6, 2010 | 5 mins

As converged, virtualization-based infrastructures become more common, who holds the keys to the infrastructure?

As server virtualization becomes ubiquitous, it’s having a profound impact on the way we organize roles and responsibilities in IT. Virtualization’s tight integration with all levels of the infrastructure renders the old paradigm of siloed server, network, and storage administrators obsolete. Nowhere is this more obvious than when planning for the capability to fail over to a secondary site.

The falling cost of replication-ready SAN hardware combined with increasingly feature-rich virtual site failover tools such as VMware’s SRM (Site Recovery Manager) has made the prospect of constructing a warm site appealing to a much broader audience. In the past, setting up a warm site might mean re-engineering the production site infrastructure to allow for SAN replication and to capture application state from physical machines that might not already be on a SAN.


In environments where server virtualization is already implemented — and on a SAN that can support replication — implementing a warm site is often simply a matter of being able to afford the extra hardware, software, and WAN bandwidth. That relative ease of deployment, combined with the promise of far less downtime in a sitewide failure, has led enterprises that never would have considered building a warm site to jump right in.

As more enterprises pursue warm site deployment, they often have trouble figuring out who to assign the project to. To be sure, any project that involves the implementation of a new secondary datacenter complete with server, storage, and networking gear is going to require the attention of experts in all of those fields. But in organizations where there are still dedicated server, network, and storage specialists, the result is often less than ideal.

On its face, virtualization would appear to be a server-based technology. To a large extent, it is: The hypervisor is there to virtualize servers, after all. But to do that, it also implements its own virtual networking stack and is increasingly integrated with the storage it runs on. It’s certainly possible to build a server virtualization environment by drawing on the skills of networking, server, and storage specialists. You might make the same assumption for a warm site: The network techs come in and lay the WAN and LAN, the storage techs implement a SAN and configure replication, and then the server techs come in and get the secondary virtualization cluster built. Right?

Not quite — if you’re using VMware’s popular vSphere hypervisor and vCenter virtualization management framework, chances are that you’ll at least consider implementing VMware’s SRM site failover management software. At its heart, SRM is really an automation tool. All of the tasks it does, you can do manually without it. That said, it saves a tremendous amount of time by automatically building a site recovery script that can be executed or tested with the click of a single button. With many regulatory auditors starting to expect organizations to demonstrate their failover capability, it’s becoming much more critical to have.

Among other things, SRM automates the process of cloning or promoting replica volumes on your secondary site SAN and mounting that newly available storage on the secondary site virtualization hosts so that the recovery process can proceed. That by itself blurs the line of operational responsibilities between server and storage administrators. Of the dedicated storage administrators I know, very few would be happy about an automated tool over which they have no control logging in to their SAN environment and making changes.
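The sequence SRM drives is straightforward to describe, even if tedious to perform by hand. Here is a minimal sketch of that orchestration — the step names are illustrative, not real SRM or vSphere API calls — with each step tagged by the silo it traditionally belongs to, which is exactly where the blurring of responsibilities shows up:

```python
# Hypothetical sketch of the recovery sequence a tool like SRM automates.
# Step names are illustrative, not real SRM/vSphere APIs. The "owner"
# field shows which traditional admin silo each step touches.
RECOVERY_PLAN = [
    ("storage", "promote replica volumes on the secondary SAN"),
    ("server",  "rescan HBAs and mount the recovered datastores"),
    ("server",  "register and power on the protected VMs"),
    ("network", "apply per-VM IP customizations for the secondary site"),
]

def run_recovery(plan, execute):
    """Run each step in order, recording which silo it touches."""
    touched = []
    for owner, action in plan:
        execute(owner, action)   # hand the step to the automation back end
        touched.append(owner)
    return touched
```

One button press walks through storage, server, and network territory in turn — no single specialist "owns" the whole plan.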

That’s not where the administrative control challenges end, either. Another thing SRM can do is dynamically re-provision recovered virtual machines to use different network settings to match the network in place at the secondary site. Yet how many of us can say that we can run into our server rooms and change the IP address of every server without disaster ensuing due to a hard-coded IP in some long-forgotten custom code?
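Hunting down those hard-coded addresses before a failover is at least partly automatable. A rough sketch: scan code and config text for dotted-quad literals. (The regex will happily flag version strings too, so treat hits as leads for a human to review, not proof.)

```python
import re

# Matches dotted-quad IPv4-looking literals. It will also flag things
# like version numbers ("2.5.0.1"), so results need a human eye.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def find_hardcoded_ips(text):
    """Return IPv4-looking literals found in a blob of code or config."""
    return IPV4.findall(text)

snippet = 'db = connect("10.1.4.22")  # TODO: use DNS instead'
hits = find_hardcoded_ips(snippet)  # -> ['10.1.4.22']
```

Run something like this across the codebase and config repositories before trusting a re-IP plan.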

Sure, you can find and fix every potential problem that might be caused by a hard-coded IP, but who wants to do that? A better plan is to make the net block that the virtualized servers run on portable from site to site. It’s arguably easier to pick that entire subnet up and move it across the WAN to the secondary site through the use of a dynamic routing protocol than it is to change the addresses of every server that is to be failed over. (Try explaining that to your network architect if the WAN wasn’t designed to accommodate it.)
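The trade-off is easy to see in miniature: moving the subnet is one routing change, while re-addressing is one change per server, before you even count the hard-coded references. A toy model — this counts changes, it doesn’t model any real routing protocol or router API:

```python
# Toy comparison of the two failover approaches. Nothing here models a
# real routing protocol; it just counts the changes each one requires.

def fail_over_by_routing(routes, subnet, new_site):
    """Move the whole subnet to the secondary site: one route change,
    and every server keeps its existing IP address."""
    routes[subnet] = new_site
    return 1  # number of changes made

def fail_over_by_reip(servers, new_addresses):
    """Re-address every failed-over server individually."""
    for name, addr in new_addresses.items():
        servers[name] = addr
    return len(new_addresses)  # one change per server
```

With 50 protected VMs, the first approach is one change and the second is 50 — plus every forgotten hard-coded reference to each old address.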

You certainly can implement this kind of failover and keep the boundaries between server, network, and storage administrators intact. But the end result will always pale in comparison to an implementation that was designed looking at the big picture.

The impact of the convergence of server, storage, and network infrastructures is hard to predict, but it’s already starting to result in a much greater demand for engineers with generalized skillsets. Staffing and maintaining competencies in the face of a fully converged infrastructure is a huge challenge that I don’t think many organizations have fully wrapped their minds around yet. The old days when a storage admin could be blissfully ignorant about the servers and the network are long gone.

This article, “In search of the jack-of-all-trades,” originally appeared at InfoWorld.com. Read more of Matt Prigge’s Information Overload blog and follow the latest developments in data storage and information management at InfoWorld.com.