Project 6 — Fine-Tuned Classifier¶
Fine-tune a small language model with LoRA for a text classification task and compare it against a zero-shot prompted baseline. This project shows when fine-tuning beats prompting — and when it doesn't.
What you'll build¶
A complete fine-tuning pipeline that:
- Prepares a labeled classification dataset in chat format
- Fine-tunes Qwen/Qwen2-0.5B-Instruct (a small, fast model) with QLoRA
- Evaluates the fine-tuned model vs. zero-shot gpt-4o-mini on held-out test data
- Merges and saves the adapter for inference
- Serves predictions via a FastAPI endpoint
Skills covered¶
| Skill | Where |
|---|---|
| Dataset preparation in chat format | 01-setup |
| QLoRA fine-tuning with SFTTrainer | 02-implementation |
| Label masking and gradient flow | 02-implementation |
| Merge and save LoRA adapter | 02-implementation |
| F1 comparison: fine-tuned vs zero-shot | 04-evaluation |
Prerequisites¶
- Week 02 Day 02 Part 1 — Fine-Tuning
- Week 02 Day 04 Part 2 — Deployment
Task: customer review sentiment classification¶
Labels: positive, negative, neutral
Input: product review text (< 200 words)
This is deliberately simple so you can focus on the fine-tuning pipeline, not the task.
Tech stack (requires GPU or Google Colab)¶
transformers==4.45.0
peft==0.13.0
trl==0.11.4
bitsandbytes==0.44.0
datasets==3.0.1
accelerate==1.0.1
torch==2.4.1
openai==1.51.0
fastapi==0.115.0
uvicorn==0.30.6
This project requires a GPU
QLoRA fine-tuning requires at least 8GB VRAM. Use Google Colab (free T4 GPU) or a cloud instance (Lambda Labs, RunPod). The inference endpoint can run on CPU.