Use Ollama for quick local model serving with a simple CLI or Docker-based deployment.
Ollama uses GGUF models and supports GPU acceleration (CUDA, Metal, ROCm).
LFM2 and LFM2.5-VL models currently do not work with Ollama. Our team has opened this PR, which, once merged, will resolve the issue. Do not use Ollama with LFM2 or LFM2.5-VL models until the PR is merged into the ollama:main branch.

Installation

Download directly from ollama.com/download.
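On macOS and Windows, the download page provides an installer. On Linux, you can also use the official one-line install script:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation with ollama --version.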

Using LFM2 Models

Ollama can load GGUF models directly from Hugging Face or from local files.

Running GGUFs

You can run LFM2 models directly from Hugging Face:
ollama run hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
See the Models page for all available GGUF repositories. To use a local GGUF file, first download a model from Hugging Face:
uv pip install huggingface-hub
hf download LiquidAI/LFM2.5-1.2B-Instruct-GGUF {quantization}.gguf --local-dir .
Replace {quantization} with your preferred quantization level (e.g., q4_k_m, q8_0). Then run the local model:
ollama run /path/to/model.gguf
For custom configurations (a specific quantization, chat template, or sampling parameters), create a Modelfile: a plain text file named Modelfile (no extension) with the following content:
FROM /path/to/model.gguf

TEMPLATE """<|startoftext|><|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|endoftext|>"
Import the model with the Modelfile:
ollama create my-model -f Modelfile
Then run it:
ollama run my-model
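To confirm that your template and parameters were applied, inspect the stored Modelfile:
ollama show my-model --modelfile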

Basic Usage

Interact with models through the command-line interface.

Interactive Chat

ollama run hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
Type your messages and press Enter. Use /bye to exit.

Single Prompt

ollama run hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF "What is machine learning?"
If you imported a model with a custom name using a Modelfile, use that name instead (e.g., ollama run my-model).

Serving Models

Ollama automatically starts a server on http://localhost:11434 with an OpenAI-compatible API for programmatic access.
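If the server is not already running (for example, on a Linux machine without the systemd service), start it manually:
ollama serve
By default it listens on port 11434; set the OLLAMA_HOST environment variable to change the bind address or port.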

Python Client

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF",
    messages=[
        {"role": "user", "content": "Explain quantum computing."}
    ],
    temperature=0.7
)
print(response.choices[0].message.content)
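The OpenAI client can also stream tokens as they are generated, which is useful for interactive applications. A minimal sketch, reusing the client from above:

# Request a streaming response instead of waiting for the full completion
stream = client.chat.completions.create(
    model="hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    stream=True,
)

# Print each token fragment as it arrives
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)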
Ollama also provides two native API endpoints.
Generate API (simple completion):
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/LiquidAI/LFM2-1.2B-GGUF",
  "prompt": "What is artificial intelligence?"
}'
Chat API (conversational format):
curl http://localhost:11434/api/chat -d '{
  "model": "hf.co/LiquidAI/LFM2-1.2B-GGUF",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}'
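Both native endpoints stream newline-delimited JSON by default. Pass "stream": false to receive a single response object instead:
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/LiquidAI/LFM2-1.2B-GGUF",
  "prompt": "What is artificial intelligence?",
  "stream": false
}'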

Vision Models

LFM2-VL and LFM2.5-VL GGUF models can also be used for multimodal inference with Ollama.
Run a vision model directly and provide images in the chat:
ollama run hf.co/LiquidAI/LFM2.5-VL-1.6B-GGUF
In the interactive chat, include the image file path directly in your prompt; Ollama detects the path and attaches the image:
>>> What's in this image? path/to/image.jpg
This also works with paths elsewhere on disk:
>>> Describe the contents of ~/Downloads/photo.png
For programmatic access, send the image as base64 through the OpenAI-compatible API:
from openai import OpenAI
import base64

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="not-needed"
)

# Encode image to base64
with open("image.jpg", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

response = client.chat.completions.create(
    model="hf.co/LiquidAI/LFM2.5-VL-1.6B-GGUF",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
                {"type": "text", "text": "What's in this image?"}
            ]
        }
    ]
)
print(response.choices[0].message.content)
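The same request can be made against Ollama's native chat endpoint, which accepts base64-encoded images in an images array. A minimal sketch using the third-party requests library:

import base64

import requests

# Encode the image to base64, as in the example above
with open("image.jpg", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

# Ollama's native /api/chat takes base64 strings in an "images" array
response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "hf.co/LiquidAI/LFM2.5-VL-1.6B-GGUF",
        "messages": [
            {"role": "user", "content": "What's in this image?", "images": [image_data]}
        ],
        "stream": False,
    },
)
print(response.json()["message"]["content"])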

Model Management

List installed models:
ollama list
Remove a model:
ollama rm hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
Show model information:
ollama show hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
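See which models are currently loaded in memory:
ollama ps
Unload a running model without removing it from disk:
ollama stop hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF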