QLoRA combines quantization with LoRA for even greater memory savings. The base model's weights are loaded in 4-bit precision instead of 16-bit, while the small LoRA adapters are kept and trained in higher precision.
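In the Hugging Face ecosystem this pairing is typically expressed with a `BitsAndBytesConfig` for the 4-bit base model plus a `peft` `LoraConfig` for the adapters. A minimal sketch (the model ID, rank, and target modules are illustrative choices, not prescribed values):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model's weights in 4-bit (NF4) with bf16 compute.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Hypothetical model ID; any causal LM supported by bitsandbytes works.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

# Attach trainable LoRA adapters on the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The frozen base stays quantized; only the adapter weights receive gradients, so optimizer state is proportional to the adapter size, not the full model.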
A 7B model that normally needs ~14 GB just to load in 16-bit now fits in ~3.5 GB. Add the LoRA training overhead and you can fine-tune on an 8 GB GPU. Consumer hardware becomes viable.
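The weight-memory figures above follow from simple arithmetic: 16-bit weights cost 2 bytes per parameter, 4-bit weights cost 0.5 bytes (real loaders add a little overhead for quantization constants and activations, which this back-of-the-envelope sketch ignores):

```python
def weight_memory_gb(n_params_billion: float, bits: int) -> float:
    """Approximate memory (GB) needed to hold the weights alone."""
    bytes_per_param = bits / 8
    return n_params_billion * 1e9 * bytes_per_param / 1e9

# 7B parameters at each precision
fp16_gb = weight_memory_gb(7, 16)
nf4_gb = weight_memory_gb(7, 4)

print(f"7B at 16-bit: {fp16_gb:.1f} GB")  # 14.0 GB
print(f"7B at 4-bit:  {nf4_gb:.1f} GB")   # 3.5 GB
```

The same formula shows why larger models stay out of reach without quantization: a 65B model needs 130 GB at 16-bit but about 32.5 GB at 4-bit.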