Paul Krill
Editor at Large

Google releases differential privacy pipeline for Python

Jan 31, 2022 · 2 mins

PipelineDP allows datasets containing personal information to be aggregated in a way that preserves the privacy of individuals.

Google is extending its differential privacy capabilities to Python with PipelineDP, an open source tool for creating pipelines that aggregate data containing personal information while preserving the privacy of individuals. The tool allows data engineers to visualize and tune the parameters used to produce differentially private information.

PipelineDP, developed in partnership with OpenMined and accessible from the project website, is still in an experimental stage. With differential privacy, useful insights and services can be provided while mathematically limiting what the results reveal about any individual. PipelineDP follows the 2019 launch of an open source version of Google’s foundational differential privacy library, which works with the C++, Go, and Java languages.
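To make the idea concrete, here is a minimal sketch of the Laplace mechanism, one standard way to release a differentially private count. It is not PipelineDP's API: the function names `laplace_noise` and `noisy_count` are hypothetical, and the sketch assumes each person contributes at most one record to the count. A smaller `epsilon` means more noise and stronger privacy.

```python
import random


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng):
    """Return a count with Laplace noise calibrated to epsilon.

    Assumption: each person adds at most 1 to the count, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(2022)  # seeded only to make the demo repeatable
for epsilon in (0.1, 1.0, 10.0):
    print(epsilon, round(noisy_count(1000, epsilon, rng), 2))
```

This is an illustration only. Sampling noise with ordinary floating-point arithmetic is known to weaken the guarantee, which is one reason to rely on a vetted library in practice.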

Developers, researchers, and companies can use the new Python library to build applications with privacy technology that enables them to gain insights and observe trends from datasets while protecting and respecting individual privacy, Google said. PipelineDP can be used with the Apache Spark and Apache Beam frameworks for data processing. It has already enabled users to begin experimenting with new use cases, such as showing a website’s most-visited pages on a per-country basis in an aggregated, anonymized way.
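The per-country page-visit use case can be sketched in plain Python. This is not PipelineDP code, and it runs locally, not on Spark or Beam. The function `dp_visit_counts` and its parameters are hypothetical. The sketch assumes the list of (country, page) partitions is fixed in advance and does not depend on the data. Each user's contributions are capped before Laplace noise is added.

```python
import random
from collections import defaultdict


def dp_visit_counts(records, public_partitions, epsilon,
                    max_partitions_per_user, rng):
    """Noisy visitor counts per (country, page) with bounded contributions.

    records: iterable of (user_id, country, page) tuples.
    public_partitions: (country, page) keys chosen independently of the data.
    """
    allowed = set(public_partitions)

    # Deduplicate so each user counts at most once per partition.
    per_user = defaultdict(set)
    for user_id, country, page in records:
        if (country, page) in allowed:
            per_user[user_id].add((country, page))

    # Contribution bounding: keep a random sample of at most
    # max_partitions_per_user partitions for each user.
    counts = defaultdict(int)
    for user_id in sorted(per_user):
        keys = sorted(per_user[user_id])
        if len(keys) > max_partitions_per_user:
            keys = rng.sample(keys, max_partitions_per_user)
        for key in keys:
            counts[key] += 1

    # One user now changes the counts by at most max_partitions_per_user
    # in total, so that value is the L1 sensitivity for the Laplace noise.
    scale = max_partitions_per_user / epsilon
    return {
        key: counts[key]
        + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        for key in public_partitions
    }


visits = [
    ("u1", "US", "/home"), ("u1", "US", "/home"),
    ("u2", "US", "/home"), ("u3", "DE", "/docs"),
]
partitions = [("US", "/home"), ("DE", "/docs"), ("FR", "/home")]
result = dp_visit_counts(visits, partitions, epsilon=1.0,
                         max_partitions_per_user=1, rng=random.Random(7))
for key, value in result.items():
    print(key, round(value, 2))
```

Every listed partition gets a noisy value, including those with no visits, so the presence of a key in the output reveals nothing about the data. Choosing partitions from the data itself would need an additional private selection step that this sketch omits.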

Google is also releasing a differential privacy tool that lets practitioners visualize and tune the parameters used to produce differentially private information. In addition, Google researchers have published a paper that shares techniques for scaling differential privacy to datasets of a petabyte or more.

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
