Whether at home or in the data center, chances are your deleted data is easily available to others

A key facet of the enterprise data explosion is the enormous disparity between the effort we expend on generating data and the effort we spend managing it. That isn’t terribly surprising. In most business contexts, the generation of data goes hand in hand with the generation of revenue. The one element those endless piles of invoices, part drawings, production records, marketing proofs, and product photos have in common is that they were all created to chase after revenue. However, the revenue benefits of managing data are not as easy to prove, and managing data almost always requires more effort than companies want to expend.

But there are benefits. For example, if you took all those invoices and put them into an accounting package, you could improve the efficiency of your bookkeepers and cut costs. Or if you built a content management system to house your product photos, you’d marginally decrease your storage requirements and stave off a primary storage upgrade for a few more months. Although managing data after it has been created is less sexy than making it in the first place, it can be worthwhile when done correctly.

Don’t overlook deleted data

The one part of data management that has absolutely no fun associated with it is handling the very last part of data’s lifecycle: its deletion. People seem to avoid deleting data like the plague; the thought of accidentally deleting something that might be needed inspires terror. Then when data does need to be deleted, it’s frequently not deleted correctly or thoroughly. The danger of allowing supposedly deleted data into the wild is ever present and requires real discipline to prevent. Data deletion is one area where both corporations and individuals desperately need to learn more and become better.

Data deleted is data still available

Imagine that you work in finance and are in the midst of an audit.
The auditors have asked you to provide a large number of very sensitive financial reports. You decide you’d rather not email them for security reasons, so you head down to IT to see if you can borrow a flash drive. The IT staff happily obliges, loaning you a nice new 8GB flash drive. You copy all your reports onto the drive and hand it to the auditors, who take what they need and give it back to you. You then delete the data on the drive and, being a good person, return the drive to IT.

What you’ve actually done is given anyone in IT or, worse, anyone to whom that flash drive is loaned next a copy of that incredibly sensitive data. How? Deleting a file on virtually any common file system simply removes an index entry from that file system’s file table. Deleting that entry does nothing to actually erase the data. Anyone who examines the unallocated portions of the disk and can recognize the format of the data can see what used to be on the disk. There are many free tools available that completely automate that process, so almost anyone can do it.

How to securely delete data

How might you avoid that potentially disastrous result? You have a few options. The least complicated way is to erase the files you want deleted and completely fill the rest of the disk with random, unimportant data. Doing so physically overwrites the leftover “deleted” data. For example, you might delete your files and copy a bunch of MP3s onto the flash drive until it was full, then delete them as well. Anyone sleuthing around on the disk will find only those MP3s, not the important data you previously stored on it. But doing it manually is a pain and error-prone, especially if you’re talking about a large-capacity hard disk rather than a small-capacity flash drive.
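
As a rough illustration of the fill-and-delete approach described above, here is a minimal Python sketch. The function name `fill_free_space`, its parameters, and the `filler-` file prefix are assumptions made for this example and are not part of any tool mentioned in this article. It writes random bytes to a temporary file on the target volume until the volume is full (or an optional cap is reached), then deletes that file. It makes no guarantee about how the underlying device or file system handles those writes, so treat it as a sketch rather than a vetted secure-deletion tool.

```python
import errno
import os
import tempfile


def fill_free_space(directory, chunk_size=1024 * 1024, max_bytes=None):
    """Fill the volume holding `directory` with random filler, then remove it.

    Hypothetical helper for illustration only. Returns the number of
    filler bytes written before the volume filled up or `max_bytes`
    was reached.
    """
    written = 0
    # mkstemp creates a uniquely named filler file inside the target directory.
    fd, filler_path = tempfile.mkstemp(prefix="filler-", suffix=".bin", dir=directory)
    try:
        while max_bytes is None or written < max_bytes:
            size = chunk_size
            if max_bytes is not None:
                size = min(chunk_size, max_bytes - written)
            try:
                count = os.write(fd, os.urandom(size))
            except OSError as exc:
                if exc.errno == errno.ENOSPC:
                    break  # the volume is full, which is the goal
                raise
            if count == 0:
                break  # nothing more could be written
            written += count
        # Ask the operating system to push the filler out to the device.
        os.fsync(fd)
    finally:
        os.close(fd)
        # Deleting the filler frees the space again for normal use.
        os.remove(filler_path)
    return written
```

To run it against a real drive, you would first delete the files you want gone, then pass the drive’s mount point (for example, a hypothetical `/media/usb`) and leave `max_bytes` unset. The `max_bytes` cap exists mainly so the function can be tried safely on a small scale.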
To ease that effort, there are tools such as Eraser (recommended by the Electronic Frontier Foundation) and the GNU coreutils utility shred that either write random data across unallocated portions of a disk (ensuring that data you’ve already deleted is obscured) or overwrite an existing file with random data and then delete it (effectively doing the same thing, but for a single file).

However, it’s important to realize that even diligent use of a fairly thorough tool like Eraser or shred may leave behind traces. This typically occurs in file systems that implement data journaling, caching, or snapshots. You usually won’t find these items on your average Windows PC, but it’s very common for network administrators to implement these features to aid in disaster recovery.

For example, many admins use Microsoft Windows Server’s Volume Shadow Copy, which allows an administrator to specify a certain portion of a disk (typically one holding file shares) as a snapshot repository. The admin can then configure snapshots to be taken at various intervals throughout the day, so it’s very easy for users to restore a previous version of a file if it is accidentally deleted. If you use a secure deletion tool to kill that file, there is still a perfectly undamaged version sitting in the Volume Shadow Copy snapshot waiting to be restored. Backup systems that work throughout the day, such as Apple’s Time Machine, also provide access to deleted files from the pre-deletion backups.

Encryption can help, but it has limits

What’s the best way to ensure that your data is never exposed? Encryption is almost always the best way to do it, but it has serious drawbacks. The best aspect of encryption is that the only thing you really need to do to ensure that properly encrypted data has been deleted is to delete (or forget) the key that encrypted it.
From there, the effort to decrypt the data is so great that you needn’t worry about someone trying to sleuth it, unless you’re storing nuclear launch codes or design specs for the iPhone 8.

The drawback to encryption: It’s much more difficult to share data with others. Not only must you manage encryption and decryption keys, you also have to make sure they aren’t exposed to unauthorized parties.

For many in IT, what I’ve just described is a “duh,” yet false deletion remains a real problem. Even if you know how to do it right, your users, your friends, and your family probably don’t. Educate them. The Electronic Frontier Foundation’s Surveillance Self-Defense Project is a great resource for all users. The reality is that many data breaches occur simply because the user didn’t understand that deleting a file does not actually delete the data.

This article, “Deleting data is the critical step you’re likely doing wrong,” originally appeared at InfoWorld.com. Read more of Matt Prigge’s Information Overload blog at InfoWorld.com.