Quantization reduces the precision of weights. Instead of 16 bits per value, you use 4 bits.
- 16-bit: 65,536 possible values per weight
- 4-bit: 16 possible values per weight
This sounds like massive information loss, but careful quantization preserves most model quality. The key is choosing the 16 representative values that best match the original weight distribution, so that rounding each weight to its nearest representative loses as little as possible.
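A minimal sketch of this idea in NumPy: build a 16-entry codebook from the quantiles of a toy weight tensor, then round each weight to its nearest codebook entry. The toy Gaussian weights and the quantile-based codebook are illustrative assumptions, not any particular library's scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy weight tensor; real LLM weights are roughly zero-centered like this.
weights = rng.normal(0.0, 0.02, size=4096).astype(np.float32)

# 16-entry codebook: one representative value per quantile bucket,
# so the codebook follows the empirical weight distribution.
codebook = np.quantile(weights, (np.arange(16) + 0.5) / 16).astype(np.float32)

# Quantize: store only the 4-bit index of the nearest codebook entry.
indices = np.abs(weights[:, None] - codebook[None, :]).argmin(axis=1).astype(np.uint8)

# Dequantize: look the approximate values back up.
dequantized = codebook[indices]

error = np.abs(weights - dequantized).mean()
print(f"codebook entries: {codebook.size}, mean abs error: {error:.6f}")
```

Each weight now costs 4 bits (an index 0..15) plus a shared 16-value table, and the reconstruction error stays small because the codebook places its values where the weights actually cluster.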