Lies, Damned Lies and Benchmarks, Yet Again

analysis

Dec 3, 2007

I’ve noticed (well, who wouldn’t?) that Randall Kennedy (RCK) is in a kerfuffle about Nick White, a Microsoft Vista Product Manager, who blogged a relatively mild discussion of “The right time to assess Windows Vista’s performance.” So, at the risk of antagonizing everyone involved for no benefit whatsoever, let me offer another perspective.

RCK wrote the benchmarks used by www.xpnet.com. Recently that organization compared release candidates of Windows XP SP3 and Windows Vista SP1, and found Vista lacking. That shouldn’t really come as a surprise to anyone, but I’m not sure that it’s the issue here.

Nick White’s blog post points out that Microsoft only publicly benchmarks products once they have been released to manufacturing; to my knowledge, that has been true for at least 20 years. When beta testers had to sign a non-disclosure agreement to work with pre-release Microsoft products, one of the key terms of the agreement was always a ban on publishing performance numbers prior to product release.

I can remember lots of products that had performance issues right up to the final release candidate that testers got to see, but were fine when released to manufacturing. So Nick has a point: a release candidate is not the right build to benchmark if you want to understand the performance of an OS. You need to wait for the RTM bits.

RCK commented on Nick’s blog, in high dudgeon, about being attacked. But was he attacked? Nick never mentioned xpnet, OfficeBench, or RCK in his post, so I’d say no.

RCK certainly interpreted the post as an attack. I read it as a little defensive, but hardly an attack.

Nick talked a lot about Principled Technologies, which did some Vista benchmarks for Microsoft last year, and Nick suggested that their benchmarks had been done properly. I’m not so sure about that: when you know the results your client wants to get, it’s easy to pick tests that will produce those results, whether you consciously mean to or not. Given the variance in results between the two sets of benchmarks, I’m not surprised that RCK feels defensive.

As a benchmark writer myself (I’m responsible for the WinTune and PC Pitstop benchmarks), I’m here to tell you that no single set of benchmarks can ever tell the whole story. My benchmarks sure can’t, and I’ve really worked at them over the years; I rather suspect that neither OfficeBench nor the Principled Technologies benchmarks can either.

So can everybody please chill?

Software Development

by Martin Heller

Contributing Writer

Martin Heller is a contributing writer at InfoWorld. Formerly a web and Windows programming consultant, he developed databases, software, and websites from his office in Andover, Massachusetts, from 1986 to 2010. From 2010 to August of 2012, Martin was vice president of technology and education at Alpha Software. From March 2013 to January 2014, he was chairman of Tubifi, maker of a cloud-based video editor, having previously served as CEO.

Martin is the author or co-author of nearly a dozen PC software packages and half a dozen Web applications. He is also the author of several books on Windows programming. As a consultant, Martin has worked with companies of all sizes to design, develop, improve, and/or debug Windows, web, and database applications, and has performed strategic business consulting for high-tech corporations ranging from tiny to Fortune 100 and from local to multinational.

Martin’s specialties include programming languages C++, Python, C#, JavaScript, and SQL, and databases PostgreSQL, MySQL, Microsoft SQL Server, Oracle Database, Google Cloud Spanner, CockroachDB, MongoDB, Cassandra, and Couchbase. He writes about software development, data management, analytics, AI, and machine learning, contributing technology analyses, explainers, how-to articles, and hands-on reviews of software development tools, data platforms, AI models, machine learning libraries, and much more.
