If only the legislators who determine data retention requirements actually thought about data

Data retention is an area that many administrators try to avoid thinking about because it is so problematic. This is partly because requirements for data retention are set by lawyers and politicians, who may not understand the implications of the regulations and policies they blithely put in place.

Content-addressable storage and other write-once systems that automatically keep data for set periods according to policy are tailored to legal archiving requirements. But these systems work only if the data retention period is clear-cut and unambiguous. For instance, health care organizations may find that they have several conflicting requirements for retaining data: one that says patient data must be retained for seven years and then destroyed, one that requires holding data for three years, and one that requires retaining data for the life of the patient or longer.

The last one in particular presents some problems. Although it’s tempting to archive data to tape and forget about it, what happens if you actually need to access that data in 20 years? Think about how data was stored 20 years ago, and whether you’d be able to retrieve it now if you needed to. Do you have access to any device that will read a 5.25-inch floppy or 9-track tape? Even if you could read the media, would the data format or application format be readable? Try importing a PC-Write document from 1988 into Microsoft Word. Imagine how much harder that will get in another 50 years. To put it another way, do you have a paper tape reader lying around?

Many organizations are deciding to keep data online or near-line indefinitely to address these issues. Rather than pull archived data off tape every few years and migrate it to newer media, they simply keep the data on disk.
After all, storage capacities continue to increase at a rate that renders old data relatively trivial in size. The first hard drive, born in 1956, had a capacity of 5MB spread across fifty 24-inch platters. Five megabytes is a drop in the pool of today’s 1.5TB drives. In another 50 years, today’s 50TB of archived data may seem trivial too, if capacities continue to increase at anywhere near the same rate. What will a 450-petabyte “drive” look like?

Of course, finding the data, finding the encryption keys from 50 years in the past, and deciphering the application format will remain problems. Maybe it’s easier to just print everything out and stick it in a filing cabinet.
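
For the curious, the 450-petabyte figure is what you get by applying the growth seen so far one more time. Here is a minimal sketch of that arithmetic. It takes the capacities quoted above at face value and assumes decimal units (1MB = 10^6 bytes), which the article does not specify. The variable names are purely illustrative.

```python
# Decimal storage units (an assumption; the article does not say which it uses).
MB = 10**6
TB = 10**12
PB = 10**15

first_drive = 5 * MB          # capacity quoted for the 1956 drive
todays_drive = 15 * TB // 10  # 1.5TB, kept as an integer number of bytes

# How many times larger today's drive is than the first one.
growth_factor = todays_drive // first_drive

# Apply the same factor once more to project a future "drive".
future_drive = todays_drive * growth_factor

print(f"growth factor: {growth_factor:,}x")
print(f"projected drive: {future_drive // PB} PB")
```

Under those assumptions the growth factor works out to 300,000, and repeating it lands on 450PB. The projection only holds if capacities keep growing at the historical rate.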