Author: Adhithyan Balajee
Affiliation: ML Engineer, Stratforge
Email: adhithyanbalajee@gmail.com
Year: 2025
Large Language Models (LLMs) such as GPT, LLaMA, and Falcon have revolutionized NLP but remain computationally expensive to fine-tune.
This work presents a comparative analysis of the Parameter-Efficient Fine-Tuning (PEFT) techniques LoRA, Adapters, and Prefix-Tuning, which reduce resource usage by training only a small fraction of parameters while keeping the pretrained backbone frozen.
Our study demonstrates that PEFT methods achieve 98–99% of full fine-tuning accuracy with 3–5× faster training and up to 80% lower GPU memory usage on GLUE and CNN/DailyMail benchmarks.
The paper provides practical insights, architecture diagrams, and empirical comparisons for researchers and practitioners adapting LLMs on limited hardware.
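For reference, the three methods introduce their trainable parameters in different places. The standard formulations from the literature are summarized below with generic notation; the dimensions $d$, $k$, rank $r$, and bottleneck size $m$ are illustrative and not tied to our specific configurations.

- **LoRA** freezes the pretrained weight $W_0 \in \mathbb{R}^{d \times k}$ and learns a low-rank update: $h = W_0 x + \frac{\alpha}{r} B A x$, with $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, and $r \ll \min(d, k)$.
- **Adapters** insert small bottleneck modules after each sublayer and train only those: $h = x + W_{\text{up}}\,\sigma(W_{\text{down}} x)$, with $W_{\text{down}} \in \mathbb{R}^{m \times d}$, $W_{\text{up}} \in \mathbb{R}^{d \times m}$, and $m \ll d$.
- **Prefix-Tuning** prepends trainable prefix vectors to the keys and values of every attention layer and optimizes only those prefixes, leaving all original weights frozen.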
- 🔹 Unified comparison of LoRA, Adapters, and Prefix-Tuning under a single experimental setup.
- 🔹 Demonstrated that LoRA achieves ≈0.8% trainable parameters with near-baseline accuracy (see the configuration sketch after this list).
- 🔹 Analysis of training time, GPU memory, and convergence rates.
- 🔹 Practical guidelines for applying PEFT in real-world NLP projects.
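As a concrete illustration of the ≈0.8% figure and of how these methods are applied in practice, the sketch below wraps a pretrained encoder with LoRA using the Hugging Face `peft` library and prints the trainable-parameter fraction. The model name, rank, and target modules are illustrative assumptions, not the exact settings used in our experiments.

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Illustrative base model and hyperparameters (not the exact experimental setup).
base_model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,          # GLUE-style sequence classification
    r=8,                                 # rank of the low-rank update
    lora_alpha=16,                       # scaling factor (alpha / r is applied to BAx)
    lora_dropout=0.1,
    target_modules=["query", "value"],   # attention projections to adapt
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports trainable vs. total parameters (well under 1% here)
```

Only the LoRA matrices (plus the newly added classification head) receive gradients and optimizer states; the frozen backbone is untouched, which is where the memory savings reported below come from.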
| Method | Trainable Params (%) | GLUE Accuracy | CNN/DailyMail ROUGE-L | Training Time | GPU Memory |
|---|---|---|---|---|---|
| Full Fine-Tuning | 100 | 92.1 | 41.6 | 12 h | 48 GB |
| LoRA | 0.8 | 91.8 | 41.3 | 3 h | 12 GB |
| Adapters | 3.0 | 91.6 | 41.0 | 4 h | 14 GB |
| Prefix-Tuning | 1.0 | 91.3 | 40.8 | 3.5 h | 13 GB |
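The memory column is driven largely by optimizer and gradient state: with Adam-style optimizers, each trainable parameter carries extra moment buffers, so freezing the backbone removes most of that overhead. The minimal PyTorch sketch below shows the mechanism behind LoRA's numbers; the layer size and rank are illustrative and not the configuration used in our experiments.

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a trainable low-rank update (generic sketch)."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        # Frozen pretrained weight: receives no gradients and no optimizer state.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)
        # Trainable low-rank factors: A (r x in) initialized randomly, B (out x r) at zero,
        # so the layer starts out identical to the frozen base.
        self.lora_A = nn.Parameter(torch.empty(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W0 x + (alpha / r) * B A x; only A and B are updated during training.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable fraction: {trainable / total:.2%}")  # ≈2% for this single layer
```

For a single 768×768 projection at rank 8, the trainable share is about 2%; across a full model the fraction drops further, since LoRA is typically applied only to a subset of weight matrices while embeddings and feed-forward blocks stay frozen.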