Storage vMotion is often derided as an unnecessary feature you'll never need. Don't believe that for a second.

An enormous chunk of the money we spend on IT infrastructure goes to avoiding downtime, from redundant power supplies and disks to clustering and redundant data centers. Yet we happily spend it. Lost productivity and shattered user confidence have a damaging effect on the ol’ career prospects.

Of all the advances over the past 10 years, server virtualization has probably done more than any other technology to decrease exposure to downtime. What server clustering once did for single database platforms, virtualization clusters can now do for any application you can run on them, which will soon be just about everything.

Technologies like VMware’s vMotion, which allows a virtual machine to be moved from one virtualization host to another without any perceptible downtime, and VMware High Availability (HA), which allows automated recovery from host failure, have gotten a lot of attention for their ability to eliminate or substantially decrease downtime windows. But if you ask me, one weapon in VMware’s arsenal doesn’t get the attention it deserves: Storage vMotion.

The Storage vMotion proposition

Storage vMotion (sVmotion, for short) allows you to move the underlying storage resources of a virtual machine from one volume or device to another without incurring downtime. Both the source and destination storage must be accessible from the same VMware vSphere host, so you can’t use this feature to move a VM’s storage and the host it’s running on at the same time. That means it isn’t much use in environments that depend on direct-attached storage.

You can almost think of Storage vMotion as the counterpart to normal, host-to-host vMotion: Instead of moving the running location of a virtual machine from one host to another without moving the storage (which is assumed to be on a SAN), it moves storage from one volume to another without changing the running host.
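To make that constraint concrete, here is a minimal sketch in plain Python. It does not call any VMware API; the `Host` and `VirtualMachine` classes, the function names, and the datastore names are all hypothetical stand-ins invented for illustration. It only models the rule described above: the host running the VM must be able to see both the source and the destination storage, and the move changes the storage while leaving the running host alone.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a host and the datastores it can see."""
    name: str
    datastores: set[str] = field(default_factory=set)


@dataclass
class VirtualMachine:
    """Hypothetical stand-in for a running VM."""
    name: str
    host: Host
    datastore: str


def can_storage_vmotion(vm: VirtualMachine, destination: str) -> bool:
    """Return True if the VM's current host sees both source and destination."""
    visible = vm.host.datastores
    return (
        vm.datastore in visible
        and destination in visible
        and destination != vm.datastore
    )


def storage_vmotion(vm: VirtualMachine, destination: str) -> None:
    """Model the move: the datastore changes, the running host does not."""
    if not can_storage_vmotion(vm, destination):
        raise ValueError(
            f"{vm.host.name} cannot see both {vm.datastore} and {destination}"
        )
    vm.datastore = destination


if __name__ == "__main__":
    # Assumed example names: one host zoned to both an old and a new SAN volume.
    host = Host("host-a", {"old-san-vol1", "new-san-vol1"})
    vm = VirtualMachine("app-server", host, "old-san-vol1")

    print(can_storage_vmotion(vm, "new-san-vol1"))
    print(can_storage_vmotion(vm, "local-disk-on-host-b"))

    storage_vmotion(vm, "new-san-vol1")
    print(vm.datastore, vm.host.name)
```

The second check is the direct-attached storage case: a disk local to some other host never shows up in this host's visible set, so the move is refused before anything is touched.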
While I’m focusing on VMware’s product offering here, it’s worth noting that other virtualization vendors offer similar functionality, with one catch: Only sVmotion avoids noticeable downtime. Others, such as Microsoft’s Storage Migration, found in Hyper-V 2008 R2, generally suspend the operation of the virtual machine for long enough to disrupt TCP connections, an event for which you’d need to plan an outage window. sVmotion can almost always be used smack dab in the middle of the day without anybody being the wiser. The trade-off is that it is only found in the Enterprise and Enterprise Plus editions of VMware vSphere, which are much more expensive than competing solutions. It’s a classic showdown between cost and functionality.

To illustrate ways that Storage vMotion can save your bacon, I’ve thrown together a few use cases that have all come up in real IT shops I work with. It’s worth noting that all of these examples arose in the past two weeks alone, so these aren’t isolated incidents.

The new SAN

One of the most obvious benefits of downtime-free storage migration is in the planned migration of storage resources from one SAN to another. In this case, a municipality had just purchased a pair of new SANs to implement site failover capability. Each of the two data centers involved already had virtualization environments running with SAN platforms that weren’t capable of replication.

Getting the new SANs installed, configured, and connected to the hosts was a relative snap due to the presence of normal host-to-host vMotion. As new hardware was added to the existing hosts, virtual machines were vMotioned to other hosts, allowing modifications to be made without impacting production. Once the hosts could see both SANs, it was literally as simple as a few clicks to start the virtual machines moving from one SAN to the other.
The only downtime required in the entire project was the few hours it took to migrate the data attached to the only remaining physical host in the network. The rest of it was done online and during business hours with zero user impact. Due to the amount of data that needed to be moved, this project would have required a combined total of about 40 hours of downtime to perform manually. Not only would this have massively increased the effort and labor cost, but it also would have caused significant disruption to the municipality’s critical fire and police dispatch systems.

The failed SAN

In another case, a large enterprise had suffered a SAN failure that triggered a failover to a secondary SAN. Fortunately, synchronous replication was in use, so the entire event had little lasting impact beyond the time it took to effect the transition. After the original problem was identified and corrected, that left the task of moving the data back to the original production SAN.

This process was made less complicated by the fact that the synchronous replication could simply be reversed and a controlled SAN failover could be initiated. However, doing this required the connected hosts to be offline for the short period of time it took to fail over each storage resource. For the physical server clusters in the environment, this meant 20 or 30 minutes of downtime apiece to shut down, fail over the volumes, and start up again.

For the virtualized environment, the same approach could have been used, but it would have been much more painful, as the environment hosted several hundred virtual machines versus the 10 or so physical servers. However, since Storage vMotion was available, it was possible to provision completely new volumes on the original production SAN and migrate the VMs from the volumes on the secondary SAN to the primary SAN without incurring any downtime at all.
The irony here was that the physical systems were originally implemented in such a way as to ensure the best possible uptime for the mission-critical systems they host. Ultimately, they experienced more downtime than the less-critical virtualized systems because they couldn’t leverage the features found within the virtualized environment. There’s some food for thought.

The corrupted VM

In this case, a single virtual machine had suffered significant data corruption as a result of a failed application upgrade. Normally, the fix for this would be to perform a restore from a backup taken prior to the upgrade. However, since the environment had been built on top of a SAN that was configured to take regular snapshots of its volumes, it was possible to recover from one of these instead.

The SAN in question allowed the snapshot to be presented to the hosts while the original volumes were also presented. Within just a few minutes, the failed VM had been removed from the virtualization infrastructure, the snapshot presented, and the snapped version of the VM added into the infrastructure and powered on. All in all, this was far, far faster than restoring several hundred gigabytes from backup; it’s also a testament to why you should use SAN snapshots if you have the ability.

However, this left the infrastructure running in a precarious state. The VM was still running on the presented snapshot, and it would take a significant amount of downtime to migrate it back into the live SAN volume on which it had been running. With Storage vMotion, it was possible to migrate the VM off of the snapshot and back into the production LUN without further downtime.

Putting it all together

The massive amount of operational flexibility that vMotion and Storage vMotion give you can’t be overstated.
However, it’s easy to get sucked into believing that this flexibility just makes your job easier by letting you perform maintenance during the day instead of scheduling off-hours downtime windows. The reality is that these features can massively decrease the amount of time it takes to deal with both the expected and the unexpected. While you might not need a live storage migration tool like VMware’s Storage vMotion all of the time, it’s worth every cent when put to use.

This article, “Avoid downtime with Storage vMotion,” originally appeared at InfoWorld.com. Read more of Matt Prigge’s Information Overload blog and follow the latest developments in storage at InfoWorld.com.