Take a hard look at your IT infrastructure and see what needs to be red-lined.

Every IT shop has its own way of doing things, but one thread ties them together: They all feel as though they have more work to do than can possibly be accomplished in the allotted time. Experts seem to think the economy is thawing out a bit, and some IT departments are starting to increase headcount, but in most cases staffing is only now reaching the levels it should have hit years ago, and it still falls short of current needs.

Much of what I write about here on InfoWorld concerns infrastructure management tasks that I consistently see left behind in the wreckage of ambitious application rollout schedules. Though implementation timetables may be met, the price of skimping on good, old-fashioned infrastructure management and planning often gets paid later — in the form of infrastructure instability, avoidable human error, and unnecessary trips to the corner office for capital or contract labor.

If you’re responsible for enterprise IT infrastructure, take a look at this list and see how you stack up. If you’re on top of everything, count yourself extremely lucky; if you aren’t, don’t feel too bad. You’re in good company.

Planning

Anyone who must submit a budget for the next fiscal year has to do at least some kind of technology planning. You can’t have any idea how much capital to ask for if you have no idea what you will need to buy. But how often is your budgeting accurate? Do you know what your primary storage infrastructure will look like in two years? Three? Do you know when your backup infrastructure may need more resources? How about when you’ll need to expand your virtualization cluster? Have you recently bought a piece of hardware or software you had to replace sooner than you expected to?
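Answering capacity questions like these doesn’t require anything exotic; even a simple straight-line projection over historical usage data will tell you roughly when you’ll hit a wall. Here is a minimal sketch of that idea — the function name, usage figures, and growth model are all hypothetical, purely for illustration, and real planning should draw on your own monitoring history:

```python
# Rough storage capacity forecast from historical usage samples.
# All numbers below are made up for illustration; substitute data
# from your own monitoring system, and remember that a straight-line
# fit ignores seasonality and step changes in demand.

def forecast_exhaustion(samples, capacity_tb):
    """Fit a least-squares line to (month, used_tb) samples and return
    the month index at which usage is projected to reach capacity_tb,
    or None if usage is flat or shrinking."""
    n = len(samples)
    xs = [m for m, _ in samples]
    ys = [u for _, u in samples]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in samples) / \
            sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    if slope <= 0:
        return None  # no projected exhaustion date
    return (capacity_tb - intercept) / slope

# Twelve months of observed usage, growing about 0.5 TB per month
history = [(m, 10.0 + 0.5 * m) for m in range(12)]
months_to_full = forecast_exhaustion(history, capacity_tb=24.0)
print(f"Projected to hit capacity at month {months_to_full:.1f}")
# -> Projected to hit capacity at month 28.0
```

Even a back-of-the-envelope projection like this gives you something concrete to put in front of the people who approve next year’s capital budget.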
Monitoring and trending

Let’s say something goes bump in the middle of the night and critical systems go offline for 15 minutes — then come back online on their own. When will you find out that it happened? Via a page that gets you out of bed as it’s happening? The next morning when you check your email? Or might you not know until it happens again two weeks from now, except in the middle of the day?

Once you know that those systems went down, what tools do you have in your arsenal to help you determine why? Do you have systems in place that let you correlate logs and performance graphs from different systems at the same time to quickly determine a root cause?

Testing

Every IT infrastructure depends on redundancy or backups to keep things rolling in the event of unplanned failure. Nonetheless, there is a very strong relationship between how often those protective measures are tested and how likely they are to work when you actually need them. The cold, hard truth: You can’t depend upon something you don’t test.

Disaster preparedness

Having well-tested redundancy and backups in place is a huge part of being prepared for disaster, but not all of it. For example, you may be prepared to bring your systems back online after a range of different disasters, but do your users know what to expect? One way to get them on the same page is to hold tabletop disaster recovery exercises with nontechnical stakeholders — and give those users a chance to plan what they’d do when the lights go out. It will make it far easier to concentrate on executing your recovery plan during wartime. There’s also a lot of work you can do to make an unplanned outage easier to navigate — such as building a dedicated management network that allows you to troubleshoot outages more easily.

Patching and upgrades

Everyone knows that patching servers (especially Windows systems) is critical.
The same can’t be said for other network-attached hardware such as SANs, network devices, and built-in management controllers, some of which languish for years between patches. Security vulnerabilities in network and storage devices rarely get attention in the press, but that doesn’t mean hackers are unaware of them — so patch away!

Internal security

Almost every infrastructure has some sort of edge security. But what about internal security? Have you segmented your internal network so that core infrastructure services and devices (virtualization hosts, SANs, and so on) are unreachable by users who don’t need access to them? If not, you leave open the possibility that a low-impact security lapse could mushroom into a serious risk to the entire infrastructure.

Documentation and cross-training

Documentation is a dirty word. But good, usable documentation can save your bacon, especially if you work in a team where responsibilities are divided among different individuals. If disaster strikes while the “network guy” is on vacation, can the remaining team members find the information they need to solve the problem? I’ve seen lapses as simple as a mislabeled (or unlabeled) cable cause hours of troubleshooting that could have been avoided. Even a little bit of documentation can go a long way toward avoiding self-inflicted wounds.

Professional development

Of all the things that fall off the back of the truck when things get busy, professional development is often the first to go — when it should be one of the last. No one who’s been in IT for very long needs to be told how quickly things change and how much work it takes to stay on top of what you’ve already deployed, much less what you might consider adding in the near future.
When presented with a new system to implement, we’re often forced to do the equivalent of mashing buttons to meet a service delivery timetable, frequently skipping the critically important familiarization that’s necessary to understand how the system actually works. It’s like an airline pilot who knows what only one-quarter of the switches in the cockpit do: He might not need the other three-quarters to take off and land under normal circumstances, but if something unexpected happens, all bets are off.

Putting it all together

For many years I’ve worked with IT departments of all shapes and sizes, and I’ve seen these infrastructure management activities fall by the wayside again and again, often witnessing the unfortunate results firsthand. The truth is that for items that don’t rise to the level of an emergency or a C-level priority, you need to make the effort to convince management they’re truly necessary. That’s the only cure for living in constant dread of the unexpected. If that describes you, don’t waste any more time. Start lobbying now.

This article, “Feel overworked? Here’s the evidence,” originally appeared at InfoWorld.com. Read more of Matt Prigge’s Information Overload blog and follow the latest developments in storage at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.