Use Unsloth for faster training with optimized kernels, reduced memory usage, and built-in quantization support.
LFM2.5 models are fully supported by Unsloth. For comprehensive guides and tutorials, see the official Unsloth LFM2.5 documentation. Note that different training methods require specific dataset formats; see Finetuning Datasets for the SFT and GRPO requirements, and the sketch of a typical SFT record below.
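
For orientation, here is a minimal sketch of one conversational SFT record and how a chat template renders it into a single training string. The schema and the example values are illustrative assumptions; Finetuning Datasets is the authoritative reference.

from transformers import AutoTokenizer

# One conversational SFT record (illustrative; confirm the exact schema
# against Finetuning Datasets)
example = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]
}

# Render the conversation into the single training string the trainer sees
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
print(text)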

Notebooks

Get started quickly with these ready-to-run Colab notebooks:

Quick Start

from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Load model with Unsloth optimizations
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="LiquidAI/LFM2.5-1.2B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,  # Enable QLoRA for memory efficiency
)

# Apply LoRA with Unsloth's optimized gradient checkpointing
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
    use_gradient_checkpointing="unsloth",  # 2x faster than default
)

# Load a training dataset (placeholder name; substitute your own dataset
# in the format described in Finetuning Datasets)
dataset = load_dataset("your-org/your-sft-dataset")

# Train with TRL's SFTTrainer
trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="./lfm2-unsloth", num_train_epochs=1, bf16=True),
    train_dataset=dataset["train"],
    tokenizer=tokenizer,
)
trainer.train()

# Fast inference
FastLanguageModel.for_inference(model)

Key Features

  • load_in_4bit=True: Enable QLoRA to reduce memory by ~4x with minimal quality loss
  • use_gradient_checkpointing="unsloth": Optimized checkpointing that’s 2x faster than default
  • FastLanguageModel.for_inference(): Switch to optimized inference mode after training (see the sketch below)
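
A minimal generation sketch after training, reusing model and tokenizer from the Quick Start above; the prompt and generation settings are illustrative:

FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference path

messages = [{"role": "user", "content": "Summarize LoRA in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model responds
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))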

Tips

  • max_seq_length: Set to your expected maximum sequence length; Unsloth pre-allocates memory for efficiency
  • Target modules: Include MLP layers (gate_proj, up_proj, down_proj) for better quality on smaller models
  • Batch size: Unsloth’s optimizations allow larger batch sizes; experiment to maximize GPU utilization (see the sketch below)
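
For example, batch size can be tuned through SFTConfig, which accepts the standard TrainingArguments fields; the values here are illustrative starting points, not tuned recommendations:

# Illustrative configuration: raise per_device_train_batch_size until GPU
# memory is nearly full, then use gradient accumulation for a larger
# effective batch
args = SFTConfig(
    output_dir="./lfm2-unsloth",
    per_device_train_batch_size=8,   # example value; tune to your GPU
    gradient_accumulation_steps=2,   # effective batch size: 8 * 2 = 16 per device
    num_train_epochs=1,
    bf16=True,
)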

Resources