The futility of developer productivity metrics

analysis
Nov 17, 2011
5 mins

Code analysis and similar metrics provide little insight into what really makes an effective software development team

Good programmers are hard to find. Worse, qualified candidates can be expensive. Little wonder, then, that software project managers want to be sure they’re getting their money’s worth. They want metrics to demonstrate how each developer’s output compares to that of the others. But while this sounds good in theory, is it really possible to quantitatively measure something as subjective as developer productivity?

IBM thinks it is. Earlier this month, Pat Howard, vice president and cloud leader for IBM Global Services, explained how Big Blue had developed a scorecard system that awards points to developers based on a number of quantitative performance metrics. The developers with the highest scores, he says, gain the best reputations around the company. The trouble is, such rating systems are seldom truly legitimate.

The oldest and most obvious metric for software development is to count lines of code: How many lines has an individual developer or team produced, and how much time did it take? In the PBS documentary “Triumph of the Nerds,” Microsoft CEO Steve Ballmer observed that in the 1980s IBM seemed to have made “a religion” out of this metric.

But lines of code is also the metric that’s easiest to debunk. In virtually any programming language, it’s possible to write the same algorithm a number of different ways, using various syntactic constructs. Some methods will inevitably be more compact than others. On the other hand, some languages require more bookkeeping and boilerplate code than others — lines that are essentially wasted space. Most important of all, the length of a program’s source code tells you virtually nothing about its quality.
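The problem is easy to demonstrate in any language. Here is a minimal sketch in Python (the function names and sample data are my own, purely for illustration): two functionally identical implementations of the same trivial algorithm, one deliberately verbose and one compact. By a lines-of-code measure, the author of the first would appear several times more "productive" than the author of the second.

```python
# Two functionally identical implementations of the same algorithm:
# summing the squares of the even numbers in a list.

# Verbose version: spreads the work across many lines.
def sum_even_squares_verbose(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            square = n * n
            total = total + square
    return total

# Compact version: the same work in a single expression.
def sum_even_squares_compact(numbers):
    return sum(n * n for n in numbers if n % 2 == 0)

data = [1, 2, 3, 4, 5, 6]

# Both produce the same result (2*2 + 4*4 + 6*6 = 56),
# yet one is several times "longer" than the other.
assert sum_even_squares_verbose(data) == 56
assert sum_even_squares_compact(data) == 56
```

Neither version is inherently better code; which one a developer writes is a matter of style and language idiom, not productivity.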

Fortunately, IBM’s current metrics are somewhat more sophisticated. Big Blue uses a source code analysis tool from Cast to compare code produced by IBM developers with known industry best practices. The code is rated for performance, security, and complexity, and the developers whose code rates the highest receive the highest scores.

It certainly sounds empirical. And yet, no matter what methods you use to evaluate programmers’ code, you’re still missing the broader picture of what real-world developers actually spend their time doing.

Development by the numbers

Code metrics are fine if all you care about is raw code production. But what happens to all that code once it’s written? Do you just ship it and move on? Hardly — in fact, many developers spend far more of their time maintaining code than adding to it. Do your metrics take into account time spent refactoring or documenting existing code? Is it even possible to devise metrics for these activities? (Counting lines of documentation makes even less sense than counting lines of code.)

Similarly, no development team exists in a vacuum. Are developers who take time to train and mentor other teams about the latest code changes considered less productive than ones who stay heads-down at their desks and never reach out to their peers? How about teams that take time at the beginning of a project to coordinate with other teams for code reuse, versus those who charge ahead blindly? Can any automated tool measure these kinds of best practices?

It’s also important to recognize that any new code can have lasting side effects, both good and bad. A piece of code might be objectively “high quality” yet still cause problems in the long term. The code might be difficult for other developers to understand, reuse, or interface with. It might require extensive, unforeseen changes in other sections of the code base. It might place additional demands on the IT and operations staff who maintain the running application. Again, such things are far easier to gauge intuitively than to measure quantitatively.

And can metrics account for productivity sinks related to unforeseen circumstances? What about code that grows longer and ever more convoluted due to scope creep — how is productivity measured then? What about code that is functional, high-quality, and delivered on time, but doesn’t do what it’s supposed to do because of simple miscommunication — who should be penalized? How well do the metrics account for delays due to budget shortfalls, bugs in tools or platforms, unmet dependencies from other groups, or dysfunctional processes?

Management, not metrics

All of these conditions and activities are of vital importance to the success of any software project, code quality notwithstanding. What’s more, they have a direct impact on a developer’s self-worth and job satisfaction — areas where productivity metrics based on source code analysis fail utterly.

IBM claims its metrics aren’t used punitively. Howard describes IBM as “a continuous learning environment,” where low productivity scores are a signal to provide more training, rather than penalize individual developers. Yet he also describes IBM developers as people who want to be known as “the greatest on planet Earth” and says IBM’s system allows them to “walk around with a scorecard” — suggesting a locker-room atmosphere that’s sure to alienate employees who don’t view their work in such competitive terms.

Perhaps more telling, Howard says IBM’s scoring system “has really wrapped our worldwide community together in a way that we didn’t anticipate.” For a global company like IBM, which maintains a massive workforce both at home and offshore, with employees coming from multiple cultures, languages, backgrounds, and work environments, maybe it does make sense to try to reduce developers to a few quantifiable traits (salary, no doubt, being an important one).

For most other companies, however, it might be best simply to forget about the idea of measuring developer productivity and rely instead on tried and true methods. That’s right: A highly effective, productive developer workforce is often the result of high-quality, effective management. Unfortunately, nobody has developed a metric for that yet. Coincidence?

This article, “The futility of developer productivity metrics,” originally appeared at InfoWorld.com. Read more of Neil McAllister’s Fatal Exception blog and follow the latest news in programming at InfoWorld.com. For the latest business technology news, follow InfoWorld.com on Twitter.