Core Architecture
Installation
Gradle Dependencies
Recommended: Use a version catalog for dependency management.
Required Permissions
Add to AndroidManifest.xml:
Runtime Permissions (Android 13+)
Request notification permission before downloading:
Loading Models
Method 1: Automatic Download and Load (Recommended)
The simplest approach: specify the model name and quantization, and the SDK handles download and loading:
Method 2: Download Without Loading
Separate download from loading for better control:
- Pre-download models during app onboarding
- Download on Wi-Fi, load later on mobile data
- Manage storage before loading heavy models
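For the storage point above, a minimal sketch of a pre-download space check using only the JVM standard library. The helper name `hasRoomFor` and the 100 MB safety margin are assumptions for illustration and are not part of the LEAP SDK; the SDK's own storage helpers should be preferred where available.

```kotlin
import java.io.File

// Hypothetical helper, not an SDK API: returns true when `dir` sits on a
// volume with at least `requiredBytes` free, plus a safety margin.
fun hasRoomFor(
    dir: File,
    requiredBytes: Long,
    safetyMarginBytes: Long = 100L * 1024 * 1024, // assumed margin, tune as needed
): Boolean {
    require(requiredBytes >= 0) { "requiredBytes must be non-negative" }
    require(safetyMarginBytes >= 0) { "safetyMarginBytes must be non-negative" }
    // usableSpace is 0 when the path does not name a partition.
    // Subtracting (instead of adding) avoids Long overflow for huge requests.
    return dir.usableSpace - safetyMarginBytes >= requiredBytes
}

fun main() {
    val dir = File(System.getProperty("java.io.tmpdir"))
    val twoGiB = 2L * 1024 * 1024 * 1024 // upper end of the model sizes quoted in this guide
    println("Room for a 2 GB model in $dir: ${hasRoomFor(dir, twoGiB)}")
}
```

On Android, the directory passed in would typically be the app's files or cache directory.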
Method 3: Cross-Platform LeapDownloader
For Kotlin Multiplatform projects (iOS, macOS, JVM, Android):
LeapDownloader doesn't provide Android-specific features like notifications or WorkManager integration. Use LeapModelDownloader for better UX on Android.
Model Download Management
Query download status, check available storage, and manage cached models:
Check Download Status
Get Model Information
Remove Downloaded Models
Cancel Ongoing Download
Check Available Storage
Complete Download Management Example
Download Status Types
Core Classes
ModelRunner
Represents a loaded model instance. Thread-safe. Methods:
- createConversation(systemPrompt: String? = null): Conversation - Start new chat
- createConversationFromHistory(history: List<ChatMessage>): Conversation - Restore chat
- suspend fun unload() - Free memory (must be called from onCleared; see Model Unloading under Critical Best Practices)
Conversation
Manages chat history and generation state. Fields:
- history: List<ChatMessage> - Full message history (copy, immutable)
- isGenerating: Boolean - Thread-safe generation status
Methods:
- generateResponse(userTextMessage: String, options: GenerationOptions? = null): Flow<MessageResponse>
- generateResponse(message: ChatMessage, options: GenerationOptions? = null): Flow<MessageResponse>
- registerFunction(function: LeapFunction) - Add tool for function calling
- appendToHistory(message: ChatMessage) - Add message without generating
ChatMessage
ChatMessageContent (Sealed Class)
- Format: WAV (RIFF) - No MP3/AAC/OGG
- Sample Rate: 16 kHz recommended (auto-resampled if different)
- Channels: Mono (1 channel) REQUIRED - stereo rejected
- Encoding: Float32, Int16, Int24, or Int32 PCM
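The audio requirements above can be checked before a file is handed to the model. The following is a rough sketch that parses the RIFF/WAVE `fmt ` chunk with the standard library only; `WavFormat`, `parseWavFormat`, `audioInputProblems`, and `wavHeader` are hypothetical names, not SDK APIs. As a simplification, it treats WAVE_FORMAT_EXTENSIBLE headers as unsupported.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

data class WavFormat(val audioFormat: Int, val channels: Int, val sampleRate: Int, val bitsPerSample: Int)

// Walks the RIFF chunks and returns the contents of the "fmt " chunk, or null if not a WAV file.
fun parseWavFormat(bytes: ByteArray): WavFormat? {
    if (bytes.size < 12) return null
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    fun tag(offset: Int) = String(bytes, offset, 4, Charsets.US_ASCII)
    if (tag(0) != "RIFF" || tag(8) != "WAVE") return null
    var pos = 12
    while (pos + 8 <= bytes.size) {
        val id = tag(pos)
        val size = buf.getInt(pos + 4)
        if (size < 0) return null
        if (id == "fmt ") {
            if (size < 16 || pos + 24 > bytes.size) return null
            return WavFormat(
                audioFormat = buf.getShort(pos + 8).toInt() and 0xFFFF,
                channels = buf.getShort(pos + 10).toInt() and 0xFFFF,
                sampleRate = buf.getInt(pos + 12),
                bitsPerSample = buf.getShort(pos + 22).toInt() and 0xFFFF,
            )
        }
        if (size > bytes.size - pos - 8) return null // chunk runs past the data we have
        pos += 8 + size + (size and 1)               // chunks are padded to even length
    }
    return null
}

// Lists violations of the stated requirements: mono, and Int16/Int24/Int32 or Float32 PCM.
// Sample rate is not checked because the guide says other rates are resampled.
fun audioInputProblems(fmt: WavFormat): List<String> = buildList {
    if (fmt.channels != 1) add("expected mono, found ${fmt.channels} channels")
    val encodingOk = when (fmt.audioFormat) {
        1 -> fmt.bitsPerSample in setOf(16, 24, 32) // integer PCM
        3 -> fmt.bitsPerSample == 32                // IEEE float
        else -> false
    }
    if (!encodingOk) add("unsupported encoding: format ${fmt.audioFormat}, ${fmt.bitsPerSample} bits")
}

// Builds a minimal 44-byte WAV header with an empty data chunk, for demonstration only.
fun wavHeader(audioFormat: Int, channels: Int, sampleRate: Int, bits: Int): ByteArray {
    val blockAlign = channels * bits / 8
    val buf = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN)
    buf.put("RIFF".toByteArray(Charsets.US_ASCII)).putInt(36)
    buf.put("WAVE".toByteArray(Charsets.US_ASCII))
    buf.put("fmt ".toByteArray(Charsets.US_ASCII)).putInt(16)
    buf.putShort(audioFormat.toShort()).putShort(channels.toShort())
    buf.putInt(sampleRate).putInt(sampleRate * blockAlign)
    buf.putShort(blockAlign.toShort()).putShort(bits.toShort())
    buf.put("data".toByteArray(Charsets.US_ASCII)).putInt(0)
    return buf.array()
}

fun main() {
    val stereo = parseWavFormat(wavHeader(audioFormat = 1, channels = 2, sampleRate = 44_100, bits = 16))
    println(stereo?.let(::audioInputProblems))
}
```

Running a check like this before building an audio message gives a clearer error than a failed generation, especially for stereo recordings.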
MessageResponse (Sealed Interface)
Streaming generation responses:
Generation Pattern (REQUIRED)
Generation Options
Structured Output (Constrained Generation)
Function Calling (Tool Use)
Multimodal Input
Vision (Image + Text)
Audio Input (Speech Recognition)
Audio Output (Text-to-Speech)
Model Selection Guide
Text Models
- LFM2.5-1.2B-Instruct: General purpose (recommended)
- LFM2.5-1.2B-Thinking: Extended reasoning (emits ReasoningChunk)
- LFM2-1.2B: Stable version
- LFM2-1.2B-Tool: Optimized for function calling
Multimodal Models
- LFM2.5-VL-1.6B: Vision + text
- LFM2.5-Audio-1.5B: Audio + text (TTS, ASR, voice chat)
Quantization (Speed vs Quality)
- Q4_0: Fastest, smallest (lowest quality)
- Q4_K_M: Recommended (good balance)
- Q5_K_M: Better quality
- Q6_K: High quality
- Q8_0: Near-original quality
- F16: Original quality (largest, slowest)
Error Handling
Critical Best Practices
1. Model Unloading (REQUIRED)
- runBlocking blocks the main thread during ViewModel cleanup
- If model unload takes >5 seconds, you get an ANR (Application Not Responding)
- Using CoroutineScope(Dispatchers.IO).launch makes cleanup async
- Always catch exceptions to prevent crashes during cleanup
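The points above can be sketched as a small helper. This assumes the kotlinx-coroutines library is on the classpath; `Unloadable` is a hypothetical stand-in for the SDK's ModelRunner (only its `suspend fun unload()` is taken from this guide), and `releaseAsync` is an illustrative name, not an SDK function.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the SDK type that exposes `suspend fun unload()`.
interface Unloadable {
    suspend fun unload()
}

// Starts the unload on an IO thread and returns immediately, so the caller
// (for example onCleared on the main thread) is never blocked.
fun releaseAsync(runner: Unloadable?, onError: (Throwable) -> Unit = {}): Job? {
    val target = runner ?: return null
    return CoroutineScope(Dispatchers.IO).launch {
        try {
            target.unload()
        } catch (e: Exception) {
            // Swallow and report: a failure during cleanup should not crash the app.
            onError(e)
        }
    }
}

fun main() {
    val fake = object : Unloadable {
        override suspend fun unload() = println("model unloaded")
    }
    // join() is used here only so the demo does not exit before the unload runs.
    runBlocking { releaseAsync(fake)?.join() }
}
```

In a ViewModel, onCleared would call the helper with the held runner and then drop its reference. The scope is deliberately not viewModelScope, which is already cancelled by the time onCleared runs.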
2. Generation Cancellation
3. Thread Safety
- All SDK operations are main-thread safe
- Use viewModelScope.launch for all suspend functions
- Callbacks run on main thread
4. History Management
5. Serialization
Complete ViewModel Example
Imports Reference
Android (LeapModelDownloader)
Cross-Platform (LeapDownloader)
Troubleshooting
Model won't load
- Check internet connection (first download)
- Verify minSdk = 31 in build.gradle.kts
- Use physical device (emulators may crash)
- Check storage space (models: 500MB-2GB)
Generation fails
- Check prompt length vs context window
- Verify model supports feature (e.g., vision, audio, function calling)
- Check isGenerating before new generation
Audio input fails
- Verify WAV format (not MP3/AAC)
- Ensure mono channel (stereo rejected)
- Check sample rate (16kHz recommended)
Memory issues
- Call modelRunner?.unload() in onCleared
- Don't load multiple models simultaneously
- Use appropriate quantization (Q4_K_M recommended)