Matt Prigge
Contributing Editor

Why you should stage a data-loss fire drill

analysis
Apr 18, 2011 · 6 mins

Today's businesses breathe data like air. Have you really thought about what would happen if the lights went out and your data went away?

Our enormous appetite for data has bred an equally huge existential dependence on that data being available. It’s not just the big names on the Fortune 1000 who can’t live without info, either. Businesses as small as florist shops and veterinary clinics can’t get by without their delivery schedules and patient records — all of which are stored digitally.

That’s progress: Digital data allows us to work more productively with less overall waste. In the early part of the last century, you could’ve said the same thing about electric power or the telephone. They solved all kinds of problems for humanity — yet look at the chaos that ensues when they suddenly become unavailable in ways no one considered.

Stage your own data outage

Any time you develop a heavy dependence on anything, it’s important to take a step back and imagine what you’d do without it. Unless “quit immediately and move to Mexico” is your answer, staging a tabletop data loss fire drill is a good idea.

In today’s do-more-with-less culture, it might seem like an astounding waste of time to spend a few hours simply talking with your team and stakeholders about unlikely worst-case scenarios. After all, you make backups and test them (right?). What’s the point of imagining what you’d do in the improbable event that they fail at the precise moment they’re needed?

The point is you won’t know the answer to that question until you ask. Moreover, the chances that someone reading this will eventually encounter a multisystem failure resulting in some form of permanent data loss are much larger than you might imagine. You only need to look at the widely varied contributing factors to any well-known air disaster to realize that even the most highly redundant safety systems can be defeated when holes in the Swiss cheese happen to line up perfectly.

Avoiding paralysis

One important goal of having a data loss fire drill is to avoid the decision-making paralysis that sets in when the truly unexpected takes place. By simply running through various disaster scenarios verbally, you can plant the seed of your potential reaction to various kinds of failures. If the disasters ever transpire, you won’t be left staring at your cohorts without the faintest idea of what to do. You’ll have a game plan (however distasteful it might be) in place, and you’ll be ready to throw it into motion.

Many of these plans might involve re-keying massive amounts of paper records or require the services of a data recovery specialist, so it’s important to identify exactly how you would do that and who you’d call for the task. The minutes and hours after the start of a major disaster are not a good time to open up the Yellow Pages.

Simple discoveries

Better yet, by talking through the impacts of data loss to productivity, you’ll inevitably discover some very basic and often entirely free actions you can take to reduce the impact of those worst-case failure scenarios.

In every single data loss emergency I’ve witnessed, someone has said something akin to “Man, if we had only done X.” In those cases, X could be as simple as having an automated report of the next few days’ orders or deliveries be pushed out to a flat text file on a workstation. If the ERP system bites the dust and it takes days to reconstruct the data, that one step could allow production and shipping to continue while you put the pieces back together. Without it, you might as well close the factory floor and send everyone home.
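As a rough illustration of how small that "X" can be, here is a minimal sketch of such a report. Everything specific in it is an assumption made for the example: the orders table, its due_date, customer, and item columns, and the use of SQLite as a stand-in for whatever database actually sits behind your ERP system.

```python
import sqlite3
from datetime import date, timedelta
from pathlib import Path


def export_upcoming_orders(conn, out_path, today, days=3):
    """Write orders due within `days` of `today` to a tab-separated text file.

    The `orders` table and its columns are hypothetical; dates are assumed
    to be stored as ISO strings (YYYY-MM-DD), which sort correctly as text.
    """
    cutoff = today + timedelta(days=days)
    rows = conn.execute(
        "SELECT due_date, customer, item FROM orders "
        "WHERE due_date >= ? AND due_date <= ? "
        "ORDER BY due_date, customer",
        (today.isoformat(), cutoff.isoformat()),
    ).fetchall()

    # A plain header plus one line per order: readable in any text editor.
    lines = ["due_date\tcustomer\titem"]
    lines += ["\t".join(str(value) for value in row) for row in rows]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(rows)
```

Run from a scheduler once a day and pointed at a workstation or shared folder, something like this would leave a file that anyone on the shipping floor can open, even when the system that produced it is down.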

Just as often, I’ve seen a completely unplanned recovery method save the day. In one case, a mission-critical database had been corrupted and its backups had silently failed for long enough that the most recently available backup was nearly useless. By chance, a developer had grabbed a copy of the production database only a few days before and stored it on his workstation.

That happenstance saved the company’s bacon, but there’s no reason it had to be left to chance. That company now automatically dumps its databases to the dev environments daily. This not only lets the devs work with recent data sets, but also serves as an informal data protection layer that augments their backups.
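The article does not say which database or tooling that company uses, so the following is only a sketch of the idea, with SQLite standing in for the production system. The function name, the dated file naming scheme, and the destination directory are all assumptions; a real deployment would more likely call the database vendor's own dump utility from a scheduled job.

```python
import sqlite3
from datetime import date
from pathlib import Path


def snapshot_database(source_path, dest_dir, today):
    """Copy a SQLite database into dest_dir as a dated snapshot file.

    SQLite is a stand-in here; the snapshot name format is hypothetical.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest_path = dest_dir / f"snapshot-{today.isoformat()}.db"

    source = sqlite3.connect(str(source_path))
    target = sqlite3.connect(str(dest_path))
    try:
        # Connection.backup makes a consistent copy even if the source
        # database is in use while the copy runs.
        source.backup(target)
    finally:
        target.close()
        source.close()
    return dest_path
```

Because each snapshot carries its date in the file name, a few days of copies accumulate on the dev side, which is exactly the kind of accidental safety net the developer's workstation copy provided.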

Getting started

It’s important to start the discussion by assuming that any data protection scheme you have in place has already failed or can’t be exercised. If your data protection plans are really solid (if so, good for you), that might require a bit of imagination, but that’s the point of this run-through.

An example might involve a failure that requires you to restore from tape media housed in a safety deposit box at a local bank — generally a fairly secure spot. What if your failure takes place on a bank holiday? What if the bank was robbed that day and the FBI has it locked down for investigation? It doesn’t matter how you justify it or how good your plans are — there will always be a hole in them. It’s the sudden appearance of that hole you’re looking for.

What your organization does will largely drive your discussions about a data loss fire drill, but you might try to answer some basic questions. In general, it’s best to try to think as far outside of the box as you can.

  • What data can we not survive without?
  • If we lose access to this mission-critical data, do we have other sources to reconstruct it? If not, how can we make sure we do?
  • Are there subsets of this data that we can routinely make available to critical end-users through low-tech means? If there are, are end-users familiar with where it is and how to use it?
  • If our worst-case recovery plans involve calling on outside help, do we know what those resources are and how to contact them?

These might seem like cutouts from a Data Protection 101 textbook, but I’m frequently surprised at how often I run into organizations that never consider these questions and simply assume their backups will always be around. Don’t be caught flat-footed by an unanticipated failure: do the planning ahead of time.
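One cheap way to stop assuming is to check, automatically, that a recent backup file actually exists. The sketch below is an illustration under stated assumptions rather than a recommended tool: it supposes backups land as files in a single directory, treats zero-byte files as failed runs, and uses an arbitrary 26-hour threshold for a nightly job. It says nothing about whether a backup can be restored, which still has to be tested separately.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None):
    """Return the age in hours of the newest non-empty file, or None.

    None means the directory holds no usable backup files at all.
    """
    now = time.time() if now is None else now
    mtimes = [
        p.stat().st_mtime
        for p in Path(backup_dir).iterdir()
        if p.is_file() and p.stat().st_size > 0  # skip empty (failed) files
    ]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_stale(backup_dir, max_age_hours=26, now=None):
    """True if there is no usable backup or the newest one is too old."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is None or age > max_age_hours
```

Wired into a daily job that emails someone when backup_is_stale returns True, a check like this would have flagged the silently failing backups described earlier long before anyone needed them.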

This article, “Why you should stage a data-loss fire drill,” originally appeared at InfoWorld.com.