With Verizon's aid, police arrest a man for storing illegal porn in the cloud, which raises questions about how much privacy cloud users can expect

Think the data you upload to a cloud storage site is private? Not necessarily. At least a dozen of the largest ISPs in the United States routinely scan stored files for alleged child pornography. When they find it, they’re obligated by federal law to blow the whistle.

Child pornography is, of course, a repugnant and illegal practice. But the case of a Maryland church deacon arrested on March 1 for allegedly possessing pornographic pictures and videos of children raises questions about how much privacy cloud storage users can expect. It also illustrates the increasing sophistication of software that analyzes video and its debilitating effect on privacy, a topic I touched on a few weeks ago.

What’s more, laws pertaining to cloud storage are so new and so vague that it isn’t even clear the data you upload to a storage site is still yours. It sounds crazy, but that’s exactly the logic the U.S. government used when it shut down MegaUpload’s service and denied innocent users access to their own property, according to a court brief filed by the Electronic Frontier Foundation.

How ISPs scan content

When Baltimore County police served a search warrant at the home of 67-year-old William Steven Albaugh, they recovered numerous files allegedly containing graphic images and videos of young children being subjected to sexual abuse. The police found the material on his home computer and a number of USB drives, but how did they know it was there in the first place?

Albaugh is a subscriber to Verizon’s high-speed Internet service and uses it to back up his data.
If he had read the company’s terms of service, which few people bother to do, he would have known that Verizon “shall have the right, but not the obligation, to monitor use of the of, and to screen, refuse, move or remove any content transmitted to or from, any Additional Service for compliance with law or the terms of this Agreement.” But he would not have known that Verizon’s storage partner, Colorado-based Digi-Data, routinely scans stored files with a powerful software tool invented by Microsoft called PhotoDNA. When the storage provider finds files that may fit the definition of child pornography, it notifies Verizon, which in turn notifies the National Center for Missing and Exploited Children (NCMEC).

In Albaugh’s case, NCMEC examined the files, deemed them illegal, and notified the police, who then obtained a search warrant and raided his home. He was arrested on a felony charge that could earn him up to 25 years in prison. He’s currently free on $75,000 bail, said Baltimore County Police spokeswoman Cathy Batton. (You can read what Albaugh told police in this article in the Baltimore Sun.)

Turning those files over to NCMEC wasn’t just a public-spirited action on Verizon’s part; service providers who spot child pornography are required by the federal PROTECT Our Children Act of 2008 to send those files to NCMEC. Service providers are not, however, required to proactively search for pornography, but about a dozen do so anyway, said John Shehan, executive director of NCMEC’s exploited child division. He said “about a dozen” because, like much of the information about this program, the list is kept quiet, and he won’t name the companies that signed a 2009 agreement to use PhotoDNA and notify NCMEC of the results. (Microsoft and Facebook have said publicly that they are using the application to hunt for child pornography.)
Verizon won’t even confirm that its storage partner is Digi-Data, but that information is contained in the terms of service.

Verizon spokeswoman Linda Laughlin told me that Verizon “never looks at customer’s data or opens those files.” She does, though, confirm that its storage partner scans uploaded files and passes on suspicious files to NCMEC. “We do exactly what the law requires,” she said.

According to Shehan, some providers actually open the files manually after the software scores a hit; others don’t. To date, more than 8 million images of suspected child pornography have been uploaded to NCMEC’s tip line. When they get there, images are reviewed by NCMEC staffers. Before turning photos over to the authorities, the staffers must determine that the files contain images of prepubescent children or infants and depict actual sexual abuse, he said. A shot of your infant in the bathtub wouldn’t qualify.

How Microsoft software fingerprints photos

PhotoDNA, which Microsoft donates to law enforcement agencies, uses technology that’s somewhat similar to facial recognition. It examines the digital information comprising a photo and creates a “hash,” essentially a fingerprint of the photo. It can then match the hash to other copies of the same image.

Older technologies used hashing to identify photos, but they were not very robust. Even after a relatively small alteration, such as resizing or saving the image in a different format, the photo would be impossible to match. PhotoDNA does not have that limitation. The newer technology is called “robust hashing.” Because the hashes are relatively small, NCMEC doesn’t need to store millions of photos on its servers. When it does identify a pornographic image, its hash is then added to a database the service providers work with when they scan photos uploaded by users. There is now data identifying some 16,000 images in the database.
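
To make the difference between ordinary hashing and “robust hashing” concrete, here is a minimal sketch in Python. It does not reproduce PhotoDNA, whose internals are not described here; it swaps in a far simpler, well-known technique usually called an “average hash.” The function names, the toy 32-by-32 grayscale images, and the match threshold are all assumptions made up for illustration.

```python
import hashlib

Image = list[list[int]]  # grayscale pixels (0-255), one inner list per row

def exact_hash(img: Image) -> str:
    """Ordinary (cryptographic) hash of the raw pixels: any change alters it."""
    raw = bytes(p for row in img for p in row)
    return hashlib.sha256(raw).hexdigest()

def average_hash(img: Image, grid: int = 8) -> int:
    """Toy robust hash: one bit per grid cell, set if the cell is
    brighter than the average of all cells. NOT the PhotoDNA algorithm."""
    h, w = len(img), len(img[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for c in cells:
        bits = (bits << 1) | (1 if c > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")

def in_database(h: int, known: set[int], max_distance: int = 4) -> bool:
    """True if h is within max_distance bits of any known hash
    (the threshold of 4 is an arbitrary, illustrative choice)."""
    return any(hamming(h, k) <= max_distance for k in known)

def resize_2x(img: Image) -> Image:
    """Double width and height by repeating each pixel."""
    return [[p for p in row for _ in range(2)] for row in img for _ in range(2)]

# Hypothetical test images: a gradient, a resized copy, and its inverse.
original = [[x * 4 + y * 3 for x in range(32)] for y in range(32)]
resized = resize_2x(original)
unrelated = [[217 - p for p in row] for row in original]

# The "database" holds only hashes, never the images themselves.
known = {average_hash(original)}

print("exact hashes equal:  ", exact_hash(original) == exact_hash(resized))
print("average hashes equal:", average_hash(original) == average_hash(resized))
print("resized copy flagged:", in_database(average_hash(resized), known))
print("unrelated flagged:   ", in_database(average_hash(unrelated), known))
```

Run as a script, the sketch reports that the SHA-256 digests of the original and the resized copy differ while their average hashes are identical, so the resized copy still matches the database and the unrelated image does not. Because the database stores only a 64-bit value per image in this toy version, it also shows why a hash list is far smaller than a photo archive.
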
It’s not clear to me if encrypting his files would have kept Albaugh out of trouble, but it is certain that if he had uploaded images whose hashes were not in the database, he would not have been caught.

If Albaugh has done what the police say he’s done, I certainly have no sympathy. The larger question, though, is twofold: Can we be confident that what we store in the cloud is secure from prying eyes? And if providers are scanning for child pornography, are they or the government scanning our data for other forms of content someone might find objectionable?

I welcome your comments, tips, and suggestions. Reach me at bill@billsnyder.biz.

This article, “When is your data not your data? When it’s in the cloud,” was originally published by InfoWorld.com. Read more of Bill Snyder’s Tech’s Bottom Line blog at InfoWorld.com.