Fine-tuning isn't free. Expect:
- QLoRA on B: - hours on RTX 4090, ~- cloud
- LoRA on B: - hours on A100, ~- cloud
- Full fine-tune B: + hours on x A100, ~+ cloud
Budget -x your first estimate for data prep and iteration.
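The arithmetic behind a budget like this can be sketched in a few lines. This is a minimal sketch, not a pricing tool: the function name `estimate_cost`, its parameters, and every number in the example call are hypothetical placeholders, not figures from this article. Substitute your own provider's hourly rate and your own training-time estimate.

```python
def estimate_cost(
    gpu_hours: float,
    hourly_rate_usd: float,
    num_gpus: int = 1,
    overhead_multiplier: float = 1.0,
) -> float:
    """Rough cloud cost for a fine-tuning run.

    gpu_hours:            wall-clock hours for one training run (your estimate)
    hourly_rate_usd:      price per GPU per hour (check your provider)
    num_gpus:             GPUs rented in parallel
    overhead_multiplier:  factor covering data prep and repeated iterations
    """
    if gpu_hours < 0 or hourly_rate_usd < 0 or overhead_multiplier < 0:
        raise ValueError("hours, rate, and multiplier must be non-negative")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return gpu_hours * hourly_rate_usd * num_gpus * overhead_multiplier


# Placeholder inputs for illustration only (assumed, not measured):
# one 10-hour run on a single GPU at $2.00/hour, tripled for iteration.
single_run = estimate_cost(10, 2.0)
with_overhead = estimate_cost(10, 2.0, overhead_multiplier=3.0)
print(f"single run: ${single_run:.2f}, with overhead: ${with_overhead:.2f}")
```

Keeping the overhead multiplier as an explicit parameter makes the gap between one clean run and the real project budget visible up front.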
Fine-tuned models also require maintenance. Base models get updated. Your data evolves. Performance degrades over time. A RAG system can be refreshed by updating its documents; a fine-tuned model has to be retrained. Consider whether RAG solves your problem at a lower ongoing cost.