How you initialize the A and B matrices matters:
- Matrix A: usually initialized with small random values (e.g., Gaussian)
- Matrix B: usually initialized to zeros
Zero-initializing B means the LoRA update ΔW = BA starts at exactly zero, so the adapted model behaves identically to the original at the start of training. This gives a stable starting point. Note that only one of the two matrices should be zero: because A is random, B still receives nonzero gradients and training can move away from zero; if both A and B were zero, the gradients for both would vanish and the update would stay stuck at zero.
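A minimal NumPy sketch of this initialization scheme (the dimensions, scale, and variable names here are illustrative, not taken from any particular LoRA implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 8, 4  # illustrative sizes; r is the LoRA rank

W = rng.normal(size=(d_out, d_in))           # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))   # small Gaussian init
B = np.zeros((d_out, r))                     # zero init

x = rng.normal(size=(d_in,))
base_out = W @ x                 # original model's output
lora_out = W @ x + B @ (A @ x)   # output with the LoRA branch added

# Because B is all zeros, BA = 0 and the two outputs match exactly.
print(np.allclose(base_out, lora_out))  # True
```

During training, gradients flow into B through the random A, so the update BA can grow away from zero while the starting behavior is untouched.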