by Jon Udell

Achieving translucency

Jun 20, 2003

Should some data be hidden from owners and operators of databases?

If you’ve never searched Google for the 10-digit string that is your phone number, try it now. Are you surprised to see your street address instantly pop up, along with links to maps and an aerial photograph of your neighborhood? This information has long been available, but it used to be wrapped in a veil of practical obscurity. In the age of Total, er, Terrorist Information Awareness (TIA), that veil has been ripped away.

There are no secrets — there are only facts that are either easier or more difficult to find. So far, the information revolution has pushed the slider inexorably toward the easy end of the continuum. Forces that might shove it back toward the center include Stanford law professor Lawrence Lessig’s famous four constraints: law, technology, markets, and social norms. The privacy laws are slowly coming — TIA notwithstanding. So how can technology help make them easier to enforce?

Peter Wayner’s recent book, Translucent Databases, proposes an intriguing approach. The central idea is that some facts should be hidden even from the owners and operators of the database that stores those facts. The database is thus rendered “translucent”: some operations and data are opaque, while others are transparent.

A translucent database can, for example, encrypt sensitive facts protected by passwords held by the database users. Wayner shows how an online clothing store could retain a customer’s purchase history while denying itself — or any unauthorized database operator — the capability of mining that data for sensitive details such as the customer’s waist size. To achieve this effect, the database combines the customer’s name with a password known only to the customer and stores the one-way hash of that combination in a column of the purchase history table.
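The scheme Wayner describes can be sketched in a few lines of Python. This is an illustration of the technique, not code from the book; the table layout, column names, and the use of SHA-256 are my assumptions.

```python
import hashlib

def customer_key(name: str, password: str) -> str:
    """One-way hash of name plus password; this hash, not the bare
    name, is what gets stored in the purchase history table.
    (SHA-256 is an assumed choice; the book's examples may differ.)"""
    return hashlib.sha256(f"{name}:{password}".encode()).hexdigest()

# A stand-in for the purchase history table: each row carries the
# opaque key plus the unencrypted product and size columns.
purchases = []

def record_purchase(name, password, product_no, waist_size):
    purchases.append({
        "key": customer_key(name, password),
        "product_no": product_no,
        "waist_size": waist_size,
    })

def history_for(name, password):
    """Only someone who supplies the right password can recover
    which rows belong to a given customer."""
    key = customer_key(name, password)
    return [p for p in purchases if p["key"] == key]

record_purchase("Alice", "s3cret", "TROUSER-42", 34)
record_purchase("Bob", "hunter2", "TROUSER-42", 36)

# The customer can retrieve her own history with her password,
# but a clerk browsing the raw table sees only opaque keys
# alongside the still-useful product numbers and sizes.
```

Because the hash is one-way, the operator cannot invert a key back to a name; yet aggregate queries over the unencrypted columns (how many size-36 trousers sold last year?) still work normally.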

“If the clerks want to waste a Friday afternoon searching for the name of the customer with the biggest gain in waist size, they’ll be disappointed,” Wayner writes. Yet because product numbers and sizes are not encrypted, “the database is still useful to purchasers planning which sizes to order next year.”

This scheme makes life more difficult for the database operator, but not as much more difficult as you might think. It does complicate life for the user, who may be reluctant to assume the burden of an extra password just to safeguard against malicious disclosure of girth.

But sooner or later, we’ll have better ways to manage sets of credentials. Suppose translucent solutions can be made more nearly frictionless. How broadly applicable might this technique be? A clothing store, although it might profit from selling names to weight-loss clinics, does not really require personal identifying information. A medical clinic, on the other hand, really must link size to identity and must share patient records with other providers.

As we shift to an economy based on access to networked services more than on ownership of goods, translucency will be harder to achieve. Identity, after all, is a condition of access to such services. Even so, when customer data need not necessarily be personalized, translucency is a powerful technique that can meet your requirements, satisfy your customers, and keep the feds happy too.

(For more on identity management and privacy, see “Does identity management clash with privacy?”)