Browse the complete example on GitHub
This is a practical example of building local AI tools and apps with:
- No cloud costs
- No network latency
- No data privacy loss
What's inside?
In this example, you will learn how to:
- Set up local AI inference using llama.cpp to run Liquid models entirely on your machine without requiring cloud services or API keys
- Build a file monitoring system that automatically processes new files dropped into a directory
- Extract structured output from images using LFM2.5-VL-1.6B, a small vision-language model
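Because llama-server exposes an OpenAI-compatible chat API, the structured-extraction request can be sketched as below. The endpoint, model name, prompt, and schema fields here are illustrative assumptions, not the repository's actual code:

```python
import base64
import json

# Default llama-server address (assumption; adjust host/port to your setup)
LLAMA_SERVER_URL = "http://localhost:8080/v1/chat/completions"

# JSON schema for the fields we want back (field names are illustrative)
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "utility": {"type": "string"},
        "amount": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["utility", "amount", "currency"],
}

def build_extraction_request(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build an OpenAI-style chat payload asking the VLM for structured JSON."""
    data_uri = f"data:{mime};base64," + base64.b64encode(image_bytes).decode()
    return {
        "model": "LFM2.5-VL-1.6B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Extract the utility type, amount, and currency from this invoice."},
                    {"type": "image_url", "image_url": {"url": data_uri}},
                ],
            }
        ],
        # OpenAI-style structured output: constrain the reply to the schema
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "invoice", "schema": INVOICE_SCHEMA},
        },
    }

# POST this payload (e.g. with httpx or requests) to LLAMA_SERVER_URL,
# then json.loads the assistant message content to get the fields.
```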
Environment setup
You will need:
- llama.cpp to serve the language models locally.
- uv to manage Python dependencies and run the application efficiently without creating virtual environments manually.
Install llama.cpp
See the installation instructions for your platform in the llama.cpp documentation, then verify that llama-server is available.
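A quick way to check that the binary is on your PATH:

```shell
# Print llama-server's usage text; if this fails, llama.cpp is not installed
# or not on your PATH
llama-server --help
```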
Install uv
Follow the installation instructions for your platform (macOS/Linux or Windows).
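As of this writing, the official standalone installers from the uv documentation are (check the uv docs for current commands):

```shell
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```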
How to run it?
Let's start by cloning the repository. The application starts llama-server for you, so there is no need to run it separately.
Watch mode
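Conceptually, watch mode boils down to a loop that notices new image files and hands each one to the parser. A minimal sketch using only the standard library (the polling approach and all names are illustrative; the example's actual implementation may use a file-watching library instead):

```python
import time
from pathlib import Path
from typing import Callable, Optional

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}

def watch(directory: Path,
          handler: Callable[[Path], None],
          poll_seconds: float = 1.0,
          max_iterations: Optional[int] = None) -> None:
    """Poll `directory` and call `handler(path)` once for each new image file."""
    seen: set = set()
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        for path in sorted(directory.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES and path not in seen:
                seen.add(path)
                handler(path)  # in the real app: send the image to the model
        iterations += 1
        time.sleep(poll_seconds)
```

In the real application, the handler would send each image to llama-server for structured extraction, e.g. `watch(Path("invoices"), parse_invoice)`.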
Run it as a background service that continuously monitors a directory and automatically parses invoice images as they land in the folder.

Process mode
Process specific files or folders and exit.

If you have make installed, you can also run the application with the provided make targets.

Results
Running the tool on the sample images under invoices/ produces the following results:
| File | Utility | Amount | Currency |
|---|---|---|---|
| water_australia.png | water | 68.46 | AUD |
| Sample-electric-Bill-2023.jpg | electricity | 28.32 | USD |
| castlewater1.png | water | 436.55 | GBP |
| british_gas.png | electricity | 81.31 | GBP |
Next steps
The model works well out of the box on our sample invoices. However, depending on your specific invoice formats and layouts, you may encounter cases where the extraction is not accurate enough. In those cases, you can fine-tune the model on your own dataset to improve accuracy.

Fine-tune Vision Language Models
Learn how to fine-tune Vision Language Models on your own dataset to improve extraction accuracy.