Content Addressed Storage, EMC touch off a debate: How to divvy up data storage?

Earlier this year, EMC introduced a new product aimed at a market it labeled content addressed storage. Ostensibly, this type of storage is a repository for data that needs to be archived but also needs to be handy for occasional access. Data that fits that profile includes electronic documents, medical images (MRIs, X-rays, etc.), archived e-mail, check images, broadcast content, and satellite images: things that users don’t access frequently but that should nevertheless be available quickly when needed.

Does a market exist for storage systems specific to these documents? Is an entirely new storage architecture required to store this type of data? Mario and Scott, your two faithful Virgils in storage hell, have only mixed answers to offer.

Let’s start with the specs of the EMC system. Obviously, that kind of data calls for large repositories, but not necessarily lightning-fast access: FC (Fibre Channel) or even SCSI disks would be overkill. Appropriately, EMC’s brainchild Centera is filled with inexpensive ATA drives that give a customer up to 19.2TB per cabinet. Sixteen cabinets clustered together give 307.2TB, and seven clusters in a domain give you just over 2PB of capacity (7 x 307.2TB = 2,150.4TB).

On the software side, with CentraStar, EMC uses a proprietary content-addressing scheme that assigns a unique address to each item, one that stays with that content permanently.

Let’s clarify. Using Centera, your file will be independently parked somewhere, and your application will hold only a receipt (the content address) bearing a unique ID. No other information is needed. As in valet parking, to retrieve your file, just hand the receipt to the attendant. The Centera attendant can detect data corruption by running an algorithm on the content and comparing the result with the content address on the receipt. That’s clean and efficient, we must say.
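For readers who want to see the valet-parking idea in concrete terms, here is a minimal sketch of content addressing in general. It is not EMC's implementation: the scheme described above is proprietary, so SHA-256 stands in for it as an assumption, and the `ContentStore` class and its method names are hypothetical.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the address is derived from the bytes."""

    def __init__(self):
        self._blobs = {}  # maps content address -> stored bytes

    @staticmethod
    def address_of(content: bytes) -> str:
        # Assumption: SHA-256 stands in for the vendor's proprietary scheme.
        return hashlib.sha256(content).hexdigest()

    def put(self, content: bytes) -> str:
        """Park the content and hand back the receipt (its address)."""
        address = self.address_of(content)
        self._blobs[address] = content
        return address

    def get(self, address: str) -> bytes:
        """Fetch by receipt, recomputing the address to detect corruption."""
        content = self._blobs[address]  # raises KeyError for an unknown receipt
        if self.address_of(content) != address:
            raise ValueError("stored content no longer matches its address")
        return content


store = ContentStore()
receipt = store.put(b"archived check image")
assert store.get(receipt) == b"archived check image"
```

Because the receipt is computed from the content itself, storing the same bytes twice yields the same receipt, and any change to the stored bytes shows up as a mismatch on retrieval.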
But here is our complaint: SANs and NAS are already two separate storage architectures that will merge over time. Centera’s proposal involves a whole new architecture that must be managed separately. Does that make sense?

Additionally, content of this type is most likely already stored elsewhere. That means you’d have to migrate all your existing records to a separate system. And forget getting rid of your tape libraries. Although Centera offers centralized and secure storage for long-lived records, you’re still going to have to archive many of them. That means using tape or optical media to get them to an off-site location.

So what is the real advantage of Centera? We’re not sure yet. The alternative, of course, is using what you already have: your high-performance SAN, plus optical disks and tape libraries. But isn’t that overkill? Or is it too little, too late?