Paul Venezia
Senior Contributing Editor

The fuss over accountability for bad code

analysis
Jul 25, 2011 | 5 mins

Who should pay when poor coding damages customers? The hornet's nest poked two weeks ago buzzes on, and an example of lazy coding helps clarify the issue

I really wanted to write about something else this week, such as how VMware is aiming a shotgun at its feet with its new licensing scheme, but my last two posts (“It’s time to make poor coding a felony” and “Why those guilty of bad code must pay”) have taken on a life of their own.

I continue to be inundated with opinions from all sides about the issue of accountability for customer data leaks. At first these comments were mostly negative, but the field has broadened. At this point, as many people seem to agree with me as think I’m a lunatic or not very bright. As far as I can tell, the majority of that latter camp either didn’t really read my posts or are in fact the very same development managers who approve and push products with glaring security holes at their core.

In fact, these two posts have garnered so much attention that I’m going to run a “live” Twitter conversation about this very subject on Thursday, July 28, 2011, from 2 to 3 p.m. ET. We’ll be using the hashtag #acctablecode, so if you want to be part of the debate, fire up your Twitter client of choice and dig in.

As this discussion rolls on, allow me to provide a specific example of what I’m talking about from a ground-level perspective. Consider these two MySQL query samples (let’s assume the input variables have been sanitized):

SELECT id, email, passwd, status, first, last FROM users WHERE email = '$sessemail' AND passwd = SHA1('".$form_password."');

SELECT id, email, passwd, status, first, last FROM users WHERE email = '$sessemail' AND passwd = '".$form_password."';

The only difference between these two queries is that the first uses SHA1 hashing on the password and the second doesn’t. While using an SHA1 hash on a password won’t guarantee the password can’t be cracked, it makes it much harder for anyone working with a dump of this database to retrieve the actual password. Even if it took only 10 seconds to match an SHA1 hash for a terrible password like “mypassword,” you’re still lengthening the time required to reveal the user’s password by 10 seconds over the plain-text equivalent, which is zero seconds. Multiply that 10 seconds by the thousands or millions of passwords in a compromised database, and the likelihood of a widespread identity theft problem stemming from a single exposed database drops dramatically.

If the second query is used, then the passwords are immediately available to anyone who gets his hands on a copy of the database, no cracking required. If the first query is used, and the user employs a strong password like “g7#$gg567,” very likely no one will crack the SHA1 hash because it would take far too long to be worthwhile.
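To make the difference concrete, here is a minimal sketch of what each approach would leave sitting in the passwd column. It is written in Python rather than the PHP and MySQL used in the queries above, and the function names (sha1_hex, stored_value, login_matches) are hypothetical, chosen only for this illustration.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Return the SHA-1 digest of the password as 40 lowercase hex characters."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def stored_value(password: str, hashed: bool) -> str:
    """What the passwd column would hold: the digest (first query) or the raw text (second)."""
    return sha1_hex(password) if hashed else password


def login_matches(stored: str, submitted: str, hashed: bool) -> bool:
    """Mirror the WHERE clause: compare the stored value with the submitted password,
    hashing the submission first when the column holds digests."""
    return stored == stored_value(submitted, hashed)


if __name__ == "__main__":
    # The two example passwords mentioned in the text.
    for pw in ("mypassword", "g7#$gg567"):
        print("plain :", stored_value(pw, hashed=False))
        print("hashed:", stored_value(pw, hashed=True))
```

Anyone reading a dump of the unhashed column sees the password itself; with hashing, they see only a 40-character digest and still have to find an input that produces it.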

MySQL and other databases include built-in hashing functions for purposes just like this. The impact on application performance is vanishingly small. Implementing simple hashing instead of plain-text storage takes a few minutes in total: enough time to add the SHA1 function call to the password storage and matching queries. That’s it.

If you want greater protection, add several layers of SHA1 and maybe MD5 hashing to the password. It might look like this: SHA1(MD5(SHA1('".$form_password."'))).
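The layered call can be sketched outside the database as well. The following Python is an approximation, assuming (as MySQL's SHA1() and MD5() do) that each function returns its digest as a lowercase hex string, so every outer layer hashes the hex text of the layer inside it. The helper names are hypothetical.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """SHA-1 digest as a 40-character lowercase hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def md5_hex(text: str) -> str:
    """MD5 digest as a 32-character lowercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def layered_hash(password: str) -> str:
    """Approximate SHA1(MD5(SHA1(password))): the innermost call runs first,
    and each outer layer hashes the hex string produced by the inner one."""
    return sha1_hex(md5_hex(sha1_hex(password)))


if __name__ == "__main__":
    print(layered_hash("mypassword"))
```

The stored value is still a 40-character string, so the column definition and the matching query keep the same shape as in the single-SHA1 case.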

Adding multiple layers of hashing won’t make it 100 percent secure, but it’s a very, very simple way to provide a significant level of protection to sensitive information like passwords, and it carries nearly no performance penalty. Tack on a fixed salt (a string that’s not found in the database) or a simple substr (or both!) to the password and you add an even greater level of protection — with nearly no time investment to speak of.
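As a rough illustration of the fixed-salt idea only, the sketch below tacks a constant string onto the password before hashing. The salt value and the function name are made up for this example; the one assumption carried over from the text is that the salt lives in application code or configuration and never in the database.

```python
import hashlib

# Hypothetical fixed salt: kept in application code or config, not in the database.
FIXED_SALT = "example-fixed-salt-not-from-the-article"


def salted_sha1(password: str, salt: str = FIXED_SALT) -> str:
    """Append the fixed salt to the password, then return the SHA-1 hex digest."""
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(salted_sha1("mypassword"))
```

Because the salt is not in the dump, someone holding only the database cannot simply look the digest up in a list of hashes of common passwords; they would need the salt as well.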

On an extremely small-spec Linux VM (one vCPU, 512MB of RAM), I wrote a tiny bit of PHP to create 500 randomized usernames and passwords, 4 to 8 characters for usernames, 6 to 12 characters for passwords. I then inserted them into the local MySQL database with and without varying levels of hashing. The result was that it took 0.019 second longer to store 500 SHA1(MD5(SHA1)) strings than it did to store the same data in plain text. That’s well within any reasonable margin of error, and it’s completely unnoticeable.
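A test along these lines is easy to reproduce. The sketch below is not the original PHP script: it is a Python approximation that uses an in-memory SQLite database in place of MySQL and does the hashing in Python rather than in the database, so the timings it prints will not match the figure above. All function names are hypothetical.

```python
import hashlib
import random
import sqlite3
import string
import time

ALPHABET = string.ascii_letters + string.digits


def random_string(rng, min_len, max_len):
    """Random alphanumeric string whose length falls in [min_len, max_len]."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def make_accounts(count=500, seed=0):
    """Usernames of 4 to 8 characters and passwords of 6 to 12, as in the text."""
    rng = random.Random(seed)
    return [(random_string(rng, 4, 8), random_string(rng, 6, 12)) for _ in range(count)]


def layered_hash(password):
    """Approximation of SHA1(MD5(SHA1(password))) on hex strings."""
    inner = hashlib.sha1(password.encode("utf-8")).hexdigest()
    middle = hashlib.md5(inner.encode("utf-8")).hexdigest()
    return hashlib.sha1(middle.encode("utf-8")).hexdigest()


def timed_insert(accounts, transform):
    """Insert every account, applying transform to each password; return (seconds, row count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, passwd TEXT)")
    start = time.perf_counter()
    conn.executemany(
        "INSERT INTO users (username, passwd) VALUES (?, ?)",
        [(user, transform(pw)) for user, pw in accounts],
    )
    conn.commit()
    elapsed = time.perf_counter() - start
    rows = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    conn.close()
    return elapsed, rows


if __name__ == "__main__":
    accounts = make_accounts()
    plain_time, _ = timed_insert(accounts, lambda pw: pw)
    hashed_time, _ = timed_insert(accounts, layered_hash)
    print(f"plain text : {plain_time:.4f} s")
    print(f"layered    : {hashed_time:.4f} s")
    print(f"difference : {hashed_time - plain_time:.4f} s")
```

Running it a few times and comparing the spread between runs with the difference between the two modes gives a feel for how much of the gap is hashing cost and how much is noise.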

These are just minor examples of exactly how simple it is to use built-in database hashing functions to allow for the storage of hashed passwords instead of plain text. This is about as simple as it gets — yet we continue to see large-scale apps that simply don’t use hashing or any form of password security.

Nonetheless, I’m still getting hammered about the position I’ve taken: that those writing code exposing plain-text passwords (or any of a number of equally egregious coding practices) should be held responsible when their awful practices enable user data to be easily stolen and turned against those users.

Maybe I just don’t get it. Maybe you’ll log on to Twitter on Thursday and tell me exactly why I’m wrong.

I’m all ears.

This story, “The fuss over accountability for bad code,” was originally published at InfoWorld.com. Read more of Paul Venezia’s The Deep End blog at InfoWorld.com.