Standard LoRA scales its low-rank update BA by α/r. As the rank r increases, this scaling shrinks the update too aggressively and becomes suboptimal.
rsLoRA (Rank-Stabilized LoRA) uses α/√r instead. This scaling better preserves gradient magnitudes as the rank changes.
The benefit: you can use higher ranks without retuning alpha, and results are more consistent when experimenting with different rank values.
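A minimal sketch of the two scaling rules, showing how the effective multiplier on the update behaves as rank grows (the function names are illustrative, not from any library):

```python
import math

def lora_scaling(alpha: float, r: int) -> float:
    # Standard LoRA: multiplier shrinks linearly with rank
    return alpha / r

def rslora_scaling(alpha: float, r: int) -> float:
    # rsLoRA: multiplier shrinks only with the square root of rank
    return alpha / math.sqrt(r)

alpha = 16
for r in (8, 64, 256):
    print(f"r={r:3d}  LoRA: {lora_scaling(alpha, r):.4f}  "
          f"rsLoRA: {rslora_scaling(alpha, r):.4f}")
```

At r=8 the two rules differ by a small constant factor, but at r=256 standard LoRA's multiplier is 16× smaller than rsLoRA's, which is why higher ranks tend to underperform without retuning alpha under the α/r rule.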