The ROI of Benchmarking

analysis
Oct 4, 2006 · 9 mins

I’ve gotten a lot of emails since Part 3 of the recruiter series, so while I’m sifting through them, I thought I’d change the topic real quick to something else that’s just as important as bashing recruiters.

I was just on the phone with a vendor and we were talking about what it takes to quantify benchmarks. Now, I'm not really talking about the results themselves; what I'm talking about is the process. How do you quantify the ROI you get from even going through the process? When you ask for a benchmarking utility that's typically going to cost into the tens of thousands, the higher-ups will want to see some ROI for their expense. I'll get to how to produce a quantifiable ROI in a minute, but first I'd like to talk about how stupid it is to even expect it to begin with.

The problem comes from the complete lack of respect companies have for their DBs. It's not like every piece of data they own is tucked away inside a filing cabinet and the DBs are just there to create more jobs. In every company I've been in for the past 10 years, the DB has been the central storage mechanism for practically every piece of data in the company. If the DB goes down, everyone notices, and if you actually lose data, the world comes to an end. Yet companies still treat DBs and DBAs like red-headed stepchildren. We're quite often not given the budget to do what we need to keep our systems up and running, or monitored, or backed up, etc. Then when something goes wrong, it's always our fault. This problem with benchmarking really highlights the poor attitude so many companies have towards their DBs.

Let's look at it like this. When we ask for a benchmarking utility and a spot in the change process to actually perform even the most perfunctory test, we're quite often told it's too expensive and there's no direct ROI. But show me the direct ROI for implementing anti-virus, or OS updates, or firewall security, etc. All of these things have been accepted as standard costs of doing business in almost every company in the world, yet try getting someone to sign off on benchmarking their DBs; it's almost impossible. I was finally (after 3 years) able to get it going at my last company, but it was a constant fight and I was the only one who was really on board. I know for a fact that they're not still doing it, so there's definitely not going to be any ROI now. But it's an insult. Why do we have to show definitive ROI for a soft process when other groups don't? I really can't count the number of times I've sat in front of an IT manager with him asking me what returns they would see back from running benchmarks. Do you think that the anti-virus guys have to justify their stuff like that? I've never seen it.

And if you take the argument out of IT for a minute, there are lots of things we do because we know they're right that we can't actually quantify. Give me the quantifiable ROI to society of not beating your kids, or of watching your weight, etc. The point is there are hundreds of things we do that we can't actually quantify, but we know they're smart… or at least right. Besides, show me one IT manager who knows the exact ROI for his anti-virus and firewalls and OS patches. There isn't one.

So how do you go about quantifying your benchmarking process? Well, you have to do something that even the anti-virus and OS guys don't have to do. You have to actually build a cost analysis for how much time you spend tracking down production problems that are due to poor design, bad indexing, inefficient queries, and the like. This takes a lot of time, so don't expect to have an answer overnight, but eventually you can go to your bosses with quantifiable ROI that they can't deny. I was at an Embarcadero presentation a couple years ago, and they showed us a Gartner study that revealed that over $5 billion a year is spent on fixing production bugs in application code in the U.S. This is 100% support cost.

What you do is keep track of the time you spend on issues that are related to performance and data quality. Then when you get to a point where you’ve been doing it for a couple months or so, and you have them all categorized into little subgroups, you can do some projections and go to your boss and show him in hard numbers how much of your time is spent supporting these issues instead of being the creative, forward-thinking DBA everyone knows you can be. The problem is, he may still say no. Go figure.
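The log-and-categorize step above can be sketched in a few lines of Python. The categories, hours, and descriptions here are hypothetical illustrations, not data from the article; use whatever subgroups fit your shop.

```python
from collections import defaultdict

# Hypothetical issue log: (category, hours spent, short description).
issue_log = [
    ("bad_indexing",      6.5, "Missing index on a busy lookup table"),
    ("inefficient_query", 4.0, "Cursor-based nightly rollup"),
    ("poor_design",       9.0, "No foreign keys; orphaned rows"),
    ("inefficient_query", 2.5, "SELECT * in a reporting proc"),
]

def hours_by_category(log):
    """Roll the log up into total hours per subgroup."""
    totals = defaultdict(float)
    for category, hours, _desc in log:
        totals[category] += hours
    return dict(totals)

print(hours_by_category(issue_log))
```

A couple of months of entries like these is enough to show, in hard numbers, where your support time actually goes.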

Now, here's something that a lot of people don't think of, and I want you to go into battle armed to the teeth, so here's an argument you might hear: if you're putting in all this time on support, how much time will you spend on the benchmarks themselves? Well, I'm not going to lie to you. Initially, it's going to be a lot of work as you learn the tool and get things set up. You have to get the tests worked out so that they're accurate, and get your reports in place so you can interpret them. It's a lot of work, and many DBAs might not even be good enough to do it right. All the same, what do you tell the boss when he asks that question? It's simple: be honest with him. Set his expectations realistically. It's not going to be a bed of roses in the beginning, because it's a new process and it has to be learned and improved. But once it's rolling, it'll take a lot less time.

As you benchmark throughout the coming weeks, start writing a best-practices doc that steers the developers and systems people away from the problems the benchmarks reveal. So if you find that you perform better with locking hints, or by getting rid of cursors, or by not setting your DBs to auto-shrink, then you can add those things to your doc and publish it periodically. Once these standards are followed, you'll find fewer things causing problems, and performing the benchmarks will become perfunctory.

Here's a short bulleted list of things that summarize this piece (there may be a couple of things I didn't actually mention above that just came to me now).

• Keep a log of the performance-related issues you worked on.

• Add up the cost of those issues including other groups you had to pull in.

• Don't forget to add in a magic number for not meeting your SLAs. This number is something like an extra 10% cost, based on the customers' perception that you're not doing a good job because you missed 3 SLAs this month and they may go with someone else because you're unreliable. So the 10% could either be lost business or the cost of a team of sales guys going out and spending hours making promises to keep their business. This is a magic number. I use 10%, but you can put in anything you like.

• Use a simple graph in Excel to project future support expense for these issues.

• When going to your boss, be honest about the effort involved. This is not a short-term solution, but you will see direct results in the weeks to come, and it will also get much easier. This is setting yourself up for success. Don't become one of those IT shops that only fights fires.

• Make benchmarking part of your change control and your new implementation process.

• Have a list of things to test for. This will also go a long way in selling your boss on it. If you have a list of specific types of issues, it’s much easier to develop a plan to catch them in testing. Now you’re actually quantifying the process because you can certify that you can eliminate certain types of problems.

• As you run your different benchmarks, keep a list of best practices to send to your application people so they can include the performance changes in their designs.
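The cost bullets above can be sketched as simple arithmetic: direct labor cost, the 10% SLA "magic number" uplift, and a straight-line projection from your monthly average. The hourly rate and monthly hours below are made-up illustration numbers, not figures from the article.

```python
HOURLY_RATE = 75.0   # assumed blended rate for you plus the groups you pull in
SLA_UPLIFT = 0.10    # the magic number from the bullets; tune to taste

# Hypothetical logged support hours for three months of issues.
monthly_support_hours = [42, 51, 47]

def monthly_cost(hours):
    """Direct labor cost plus the SLA-perception penalty."""
    direct = hours * HOURLY_RATE
    return direct * (1 + SLA_UPLIFT)

def project_annual(history):
    """Project a year of support cost from the monthly average."""
    avg = sum(monthly_cost(h) for h in history) / len(history)
    return avg * 12

print(round(project_annual(monthly_support_hours), 2))
```

The same numbers drop straight into an Excel trend line; the point is only that the projection is mechanical once the log exists.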

Benchmarking Helps:

• Capacity planning – you know the max load your system can handle and you know what it looks like when it's put under stress. This is an excellent way to plan for the future, because if your box starts showing those signs, it's much easier to make a plan of action.

• Baselining – this is almost never done anymore, but even a perfunctory benchmark will give you some good baseline stats. This way, even a new application can be troubleshot a lot more easily, because you have something to compare the current stats to. If you've never taken stats, then how do you know whether what you're seeing is out of whack?

• Reduced support cost – I really don’t have to go over this one anymore do I?

• Build reliable standards – As the process matures you begin to build standards that work for your company’s processes.

• Increased developer productivity – as you find issues in the code and make the developers aware of them, it increases their awareness of how things interact. Too many times devs write code without a single thought to how it'll play in the production environment. So this will increase their awareness and prevent them from having to go back and release patches. And this frees them up to write more new code instead of supporting old code. See, it's good all the way around.
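The baselining point above boils down to a comparison: save stats from a benchmark run, then flag current stats that have drifted too far. A minimal sketch, where the metric names and the 25% threshold are illustrative assumptions rather than anything from the article:

```python
# Hypothetical baseline stats captured during a benchmark run,
# and the stats you're seeing right now.
baseline = {"batch_req_per_sec": 1200.0, "avg_disk_ms": 8.0, "cpu_pct": 35.0}
current  = {"batch_req_per_sec": 1150.0, "avg_disk_ms": 14.5, "cpu_pct": 38.0}

def out_of_whack(baseline, current, threshold=0.25):
    """Return the metrics that drifted more than `threshold` from baseline."""
    flagged = {}
    for name, base in baseline.items():
        drift = abs(current[name] - base) / base
        if drift > threshold:
            flagged[name] = round(drift, 3)
    return flagged

print(out_of_whack(baseline, current))
```

Without the saved baseline, the disk latency jump here would just look like another vague complaint; with it, you have a number to point at.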

Anyway, that’s all I’ve got for now. These are just some quick ideas and this isn’t meant to be a whitepaper on the benefits of benchmarking. Take this and do with it what you will, but at least now it’s being said.