Use Unsloth for faster training with optimized kernels, reduced memory usage, and built-in quantization support.
LFM2.5 models are fully supported by Unsloth. For comprehensive guides and tutorials, see the official Unsloth LFM2.5 documentation. Note that different training methods require specific dataset formats; see Finetuning Datasets for the SFT and GRPO requirements, and the sketch of a typical SFT record below.
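
For orientation, here is a minimal sketch of one conversational SFT record and how a chat template renders it into a single training string. The schema and the example values are illustrative assumptions; Finetuning Datasets is the authoritative reference.

from transformers import AutoTokenizer

# One conversational SFT record (illustrative; confirm the exact schema
# against Finetuning Datasets)
example = {
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ]
}

# Render the conversation into the single training string the trainer sees
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
print(text)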

Notebooks

Get started quickly with these ready-to-run Colab notebooks:

Quick Start

from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# Load model with Unsloth optimizations
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="LiquidAI/LFM2.5-1.2B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,  # Enable QLoRA for memory efficiency
)

# Apply LoRA with Unsloth's optimized gradient checkpointing
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
    use_gradient_checkpointing="unsloth",  # 2x faster than default
)

# Load a training dataset (placeholder name; substitute your own dataset
# in the format described in Finetuning Datasets)
dataset = load_dataset("your-org/your-sft-dataset")

# Train with TRL's SFTTrainer
trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir="./lfm2-unsloth", num_train_epochs=1, bf16=True),
    train_dataset=dataset["train"],
    tokenizer=tokenizer,
)
trainer.train()

# Fast inference
FastLanguageModel.for_inference(model)

Key Features

  • load_in_4bit=True: Enable QLoRA to reduce memory by ~4x with minimal quality loss
  • use_gradient_checkpointing="unsloth": Optimized checkpointing that’s 2x faster than default
  • FastLanguageModel.for_inference(): Switch to optimized inference mode after training (see the sketch below)
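
A minimal generation sketch after training, reusing model and tokenizer from the Quick Start above; the prompt and generation settings are illustrative:

FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference path

messages = [{"role": "user", "content": "Summarize LoRA in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so the model responds
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))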

Tips

  • max_seq_length: Set to your expected maximum sequence length; Unsloth pre-allocates memory for efficiency
  • Target modules: Include MLP layers (gate_proj, up_proj, down_proj) for better quality on smaller models
  • Batch size: Unsloth’s optimizations allow larger batch sizes; experiment to maximize GPU utilization (see the sketch below)
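
For example, batch size can be tuned through SFTConfig, which accepts the standard TrainingArguments fields; the values here are illustrative starting points, not tuned recommendations:

# Illustrative configuration: raise per_device_train_batch_size until GPU
# memory is nearly full, then use gradient accumulation for a larger
# effective batch
args = SFTConfig(
    output_dir="./lfm2-unsloth",
    per_device_train_batch_size=8,   # example value; tune to your GPU
    gradient_accumulation_steps=2,   # effective batch size: 8 * 2 = 16 per device
    num_train_epochs=1,
    bf16=True,
)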

Resources