Paul Krill
Editor at Large

Google API brings LLMs to Android and iOS devices

Mar 8, 2024 | 2 mins

Experimental MediaPipe LLM Inference API allows developers to run large language models ‘on-device’ across Android, iOS, and web platforms.


Google has released an experimental API that allows large language models to run fully on-device across Android, iOS, and web platforms.

Introduced March 7, the MediaPipe LLM Inference API is designed to streamline on-device LLM integration for web developers and also supports Android and iOS. The API provides initial support for four LLMs: Gemma, Phi 2, Falcon, and Stable LM.

Google warns that the API is experimental and still under active development, but says it gives researchers and developers the ability to prototype and test openly available models on-device. For production Android applications that use LLMs, Google noted that developers can use the Gemini API or run Gemini Nano on-device through Android AICore, a system-level capability introduced in Android 14. AICore provides Gemini-powered solutions for high-end devices, including integrations with accelerators, safety filters, and LoRA adapters.

Developers can try the MediaPipe LLM Inference API via a web demo or by building sample demo apps. An official sample is available on GitHub. The API allows developers to bring LLMs on-device in a few steps, using platform-specific SDKs. Through significant optimizations, the API can deliver state-of-the-art on-device latency, focusing on the CPU and GPU to support multiple platforms, Google said. The company plans to expand the API to more platforms and models in the coming year.
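
To make the "few steps" concrete, here is a minimal TypeScript sketch of the general shape such an integration tends to take: configure the model, create the inference engine, then generate a response. Every name in it (`LlmOptions`, `OnDeviceLlm`, `EchoLlm`, `runPrompt`, and the model path) is a hypothetical stand-in chosen for illustration, not the MediaPipe SDK's actual API, and the stub simply echoes a truncated prompt so the example runs without a model file.

```typescript
// Hypothetical options; the field names are assumptions, not MediaPipe's.
interface LlmOptions {
  modelPath: string; // placeholder location of a model file on the device
  maxTokens: number; // cap on output length (the stub counts words, not tokens)
}

// Hypothetical minimal interface for an on-device engine.
// A real engine would most likely be asynchronous.
interface OnDeviceLlm {
  generateResponse(prompt: string): string;
}

// Stub standing in for a real engine so the sketch runs anywhere.
// It echoes the first `maxTokens` words of the prompt.
class EchoLlm implements OnDeviceLlm {
  private readonly options: LlmOptions;

  constructor(options: LlmOptions) {
    this.options = options;
  }

  generateResponse(prompt: string): string {
    const words = prompt.trim().split(/\s+/).slice(0, this.options.maxTokens);
    return `[${this.options.modelPath}] ${words.join(" ")}`;
  }
}

// Step 1: configure. Step 2: create the engine. Step 3: generate.
function runPrompt(prompt: string, options: LlmOptions): string {
  const llm: OnDeviceLlm = new EchoLlm(options);
  return llm.generateResponse(prompt);
}
```

In a real app, the stub would be replaced by the engine from the platform-specific SDK, and the calling code would keep the same three-step flow. Developers should consult Google's documentation and the official sample for the actual class and option names.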


Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
