datarekha

What is LoRA and how does it make fine-tuning parameter-efficient?

The short answer

LoRA freezes the pretrained weights and injects small trainable low-rank matrices into selected layers, learning the weight update as their low-rank product. This trains a tiny fraction of parameters, slashing memory and storage while approximating full fine-tuning, and the adapters can be merged back at inference.

How to think about it

LoRA freezes the pretrained weights and injects small trainable low-rank matrices into selected layers, learning the weight update as their low-rank product. This trains a tiny fraction of parameters, slashing memory and storage while approximating full fine-tuning, and the adapters can be merged back at inference.

Learn it properly LoRA & QLoRA fine-tuning

Keep practising

All NLP & LLMs questions

Explore further

Skip to content