LiteRT-LM
LiteRT-LM is Google’s production-ready, high-performance, open-source inference framework for deploying Large Language Models on edge devices.
Product Website
What’s New: Gemma 4 support with LiteRT-LM
Deploy Gemma 4 across a broad range of hardware with stellar performance (blog).
Try the LiteRT-LM CLI on Linux, macOS, Windows (WSL), or Raspberry Pi (see the LiteRT-LM CLI Guide for installation and usage).
Key Features
- Cross-Platform Support: Android, iOS, Web, Desktop, and IoT (e.g., Raspberry Pi).
- Hardware Acceleration: Peak performance via GPU and NPU accelerators.
- Multi-Modality: Support for vision and audio inputs.
- Tool Use: Function-calling support for agentic workflows.
- Broad Model Support: Gemma, Llama, Phi-4, Qwen, and more.
Production-Ready for Google’s Products
LiteRT-LM powers on-device GenAI experiences in Chrome, Chromebook Plus, Pixel Watch, and more.
You can also try the Google AI Edge Gallery app to run models immediately on your device.
Blogs & Announcements
| Link | Description |
|---|---|
| Bring state-of-the-art agentic skills to the edge with Gemma 4 | Deploy Gemma 4 in-app and across a broader range of devices with stellar performance and broad reach using LiteRT-LM. |
| On-device GenAI in Chrome, Chromebook Plus and Pixel Watch | Deploy language models on wearables and browser-based platforms using LiteRT-LM at scale. |
| On-device Function Calling in Google AI Edge Gallery | Explore how to fine-tune FunctionGemma and enable function calling capabilities powered by LiteRT-LM Tool Use APIs. |
| Google AI Edge small language models, multimodality, and function calling | Latest insights on RAG, multimodality, and function calling for edge language models. |
Quick Start
Key Links
- Technical Overview, including performance benchmarks, model support, and more.
- LiteRT-LM CLI Guide, including installation, getting started, and advanced usage.
Quick Try (No Code)
Try LiteRT-LM immediately from your terminal, without writing a single line of code, using uv (see the LiteRT-LM CLI Guide for the exact command).
Supported Language APIs
Ready to get started? Explore our language-specific guides and setup instructions.
| Language | Status | Best For | Documentation |
|---|---|---|---|
| Kotlin | Stable | Android apps & JVM | Android (Kotlin) Guide |
| Python | Stable | Prototyping & scripting | Python Guide |
| C++ | Stable | High-performance native code | C++ Guide |
| Swift | In development | Native iOS & macOS | (Coming Soon) |
Build From Source
This guide shows how to compile LiteRT-LM from source. When building from
source, check out a stable tag rather than the tip of the main branch.
Releases
- v0.10.1: Deploy Gemma 4 with stellar performance (blog); introduces the LiteRT-LM CLI.
- v0.9.0: Improved function-calling capabilities and better app performance stability.
- v0.8.0: Desktop GPU support and Multi-Modality.
- v0.7.0: NPU acceleration for Gemma models.
For a full list of releases, see GitHub Releases.
