Serdar Yegulalp
Senior Writer

Google’s machine learning cloud pipeline explained

feature
May 19, 2017 | 4 mins

You’ll be dependent on TensorFlow to get the full advantage, but you’ll gain a true end-to-end engine for machine learning


When Google first told the world about its Tensor Processing Unit, the strategy behind it seemed clear enough: Speed machine learning at scale by throwing custom hardware at the problem. Use commodity GPUs to train machine-learning models; use custom TPUs to deploy those trained models.

The new generation of Google’s TPUs is designed to handle both of those duties, training and deploying, on the same chip. That new generation is also faster, both on its own and when scaled out with others in what’s called a “TPU pod.”

But faster machine learning isn’t the only benefit from such a design. The TPU, especially in this new form, constitutes another piece of what amounts to Google building an end-to-end machine-learning pipeline, covering everything from intake of data to deployment of the trained model.

Machine learning: A pipeline runs through it

One of the largest obstacles to using machine learning right now is how tough it can be to put together a full pipeline for the data: intake, normalization, model training, and model deployment. The pieces are still highly disparate and uncoordinated. Companies like Baidu have hinted at wanting to create a single, unified, unpack-and-go solution, but so far that’s just a notion.
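To make those stages concrete, here is a minimal sketch in plain Python of the four phases named above, each handing its output to the next. Everything in it is hypothetical: the function names, the toy data, and the one-feature linear model are illustrative assumptions, not Google's pipeline or any TensorFlow API.

```python
from statistics import mean, pstdev

def intake(raw_rows):
    """Parse raw 'x,y' text rows into (float, float) pairs, skipping blank rows."""
    return [tuple(float(v) for v in row.split(",")) for row in raw_rows if row.strip()]

def normalize(pairs):
    """Scale x to zero mean and unit variance; also return the scaling that was used."""
    xs = [x for x, _ in pairs]
    mu = mean(xs)
    sigma = pstdev(xs) or 1.0  # avoid dividing by zero on constant input
    return [((x - mu) / sigma, y) for x, y in pairs], (mu, sigma)

def train(pairs):
    """Fit y = w*x + b by ordinary least squares on a single feature."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    spread = sum((x - x_bar) ** 2 for x in xs)
    w = sum((x - x_bar) * (y - y_bar) for x, y in pairs) / spread if spread else 0.0
    return w, y_bar - w * x_bar

def deploy(model, scaling):
    """Wrap the trained weights and the training-time scaling in a predict function."""
    w, b = model
    mu, sigma = scaling
    def predict(x):
        return w * ((x - mu) / sigma) + b
    return predict

# Toy data that follows y = 2x + 1, with one blank row to exercise intake.
raw = ["1,3", "2,5", "3,7", "", "4,9"]
data, scaling = normalize(intake(raw))
predict = deploy(train(data), scaling)
```

Even in a toy this small, the deployed model has to carry the normalization parameters from training along with it. That kind of coupling between stages is part of why stitching a pipeline together from separate pieces is awkward.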

The most likely place for such a solution to emerge is in the cloud. As time goes by, much more of the data collected for machine learning (and everything else, really) lives there by default. So does the hardware needed to produce actionable results from it. Give people a single end-to-end, in-the-cloud workflow for machine learning, one with only a few knobs on it by default, and they’ll be happy to build on top of it.

Already mostly realized, Google’s vision is that each phase of the pipeline can be executed in the cloud, as close as possible to the data, for the best possible speed. With TPUs, Google also seeks to provide many of the phases with custom hardware acceleration that can be scaled out on demand.

The new TPUs are meant to boost pipeline acceleration in several ways. One speedup comes from being able to gang multiple TPUs. Another comes from being able to train and deploy models from the same slab of silicon. With the latter, it’s easier to incrementally retrain models as new data comes in, because the data doesn’t have to be moved around as much.
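As a rough illustration of what incremental retraining means in software terms, the sketch below keeps one model object that both serves predictions and accepts later updates, so new data refines the existing weights instead of triggering a fit from scratch. The class name, learning rate, and data are hypothetical, and the stochastic gradient descent update is a generic technique chosen for the example; it says nothing about how TPUs or TensorFlow implement retraining.

```python
class OnlineLinearModel:
    """One-feature linear model y = w*x + b, updated by stochastic gradient descent."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, batch, epochs=1):
        """Refine the current weights on a batch rather than refitting from scratch."""
        for _ in range(epochs):
            for x, y in batch:
                error = self.predict(x) - y
                self.w -= self.learning_rate * error * x
                self.b -= self.learning_rate * error

model = OnlineLinearModel()

# Initial training on the data available at first (points on y = 2x + 1).
model.update([(0.0, 1.0), (1.0, 3.0)], epochs=1000)
first_guess = model.predict(2.0)

# New data arrives later; the same object is updated in place and keeps serving.
model.update([(2.0, 5.0), (3.0, 7.0)], epochs=100)
```

The same object handles both `predict` and `update`, which loosely mirrors the appeal of training and serving on one chip: nothing has to be exported, moved, and reloaded between the two steps.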

That optimization—operating on data where it is to speed up operations on it—is also right in line with other machine learning performance improvements in the works, such as some proposed Linux kernel fixes and common APIs for machine learning data access.

But are you willing to lock yourself into TensorFlow?

There’s one possible downside to Google’s vision: that the performance boost provided by TPUs works only if you use the right kind of machine-learning framework with it. And that means Google’s own TensorFlow.

It’s not that TensorFlow is a bad framework; in fact, it’s quite good. But it’s only one framework of many, each suited to different needs and use cases. So TPUs’ limitation of supporting just TensorFlow means you have to use it, regardless of its fit, if you want to squeeze maximum performance out of Google’s ML cloud. Another framework might be more convenient to use for a particular job, but it might not train or serve predictions as quickly because it’ll be consigned to running only on GPUs.

None of this rules out the possibility that Google could introduce other hardware, such as customer-reprogrammable FPGAs, to give frameworks not directly sponsored by Google an edge as well.

But for most people, the inconvenience of TPUs accelerating only TensorFlow workloads will be far outweighed by the convenience of having a managed, cloud-based, everything-in-one-place pipeline for machine-learning work. So, like it or not, prepare to use TensorFlow.

Serdar Yegulalp is a senior writer at InfoWorld. A veteran technology journalist, Serdar has been writing about computers, operating systems, databases, programming, and other information technology topics for 30 years. Before joining InfoWorld in 2013, Serdar wrote for Windows Magazine, InformationWeek, Byte, and a slew of other publications. At InfoWorld, Serdar has covered software development, devops, containerization, machine learning, and artificial intelligence, winning several B2B journalism awards including a 2024 Neal Award and a 2025 Azbee Award for best instructional content and best how-to article, respectively. He currently focuses on software development tools and technologies and major programming languages including Python, Rust, Go, Zig, and Wasm. Tune into his weekly Dev with Serdar videos for programming tips and techniques and close looks at programming libraries and tools.