Paul Krill
Editor at Large

Meta introduces Llama Stack distributions for building LLM apps

news
Sep 26, 20242 mins

Llama Stack distributions package multiple Llama Stack API providers to ease the end-to-end development of generative AI applications for developers.

Four Llamas on the range - LLMs
Credit: Noe Besso/Shutterstock

Looking to ease the development of generative AI applications, Meta is sharing its first official Llama Stack distributions, to simplify how developers work with Llama large language models (LLMs) in different environments.

Unveiled September 25, Llama Stack distributions package multiple Llama Stack API providers that work well together to provide a single endpoint for developers, Meta announced in a blog post. The Llama Stack defines building blocks for bringing generative AI applications to market. These building blocks span the development life cycle from model training and fine-tuning through to product evaluation and on to building and running AI agents and retrieval-augmented generation (RAG) applications in production. A repository for Llama Stack API specifications can be found on GitHub.

Meta also is building providers for the Llama Stack APIs. The company is looking to ensure that developers can assemble AI solutions using consistent, interlocking pieces across platforms. Llama Stack distributions are intended to enable developers to work with Llama models in multiple environments including on-prem, cloud, single-node, and on-device, Meta said. The Llama Stack consists of the following set of APIs:

  • Inference
  • Safety
  • Memory
  • Agentic System
  • Evaluation
  • Post Training
  • Synthetic Data Generation
  • Reward Scoring

Each API is a collection of REST endpoints. The introduction of the Llama Stack distributions is happening alongside Meta’s release of Llama 3.2, which includes small and medium-sized vision LLMs (11B and 90B) and lightweight, text-only models (1B and 3B) that fit onto edge and mobile devices.

Paul Krill

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.

More from this author