You saw the headline numbers in Section : roughly experiments, additive improvements, % training speedup. Here's what matters more.
Karpathy tested whether those findings transferred. Every improvement discovered on a depth- model also improved a larger depth- model. These weren't tricks that only worked at one scale: the changes to training dynamics held up when the model got bigger. That's the strongest evidence that the loop produces real findings, not artifacts of a specific configuration.
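To make that transfer test concrete, here is a minimal sketch of the kind of check it implies, not Karpathy's actual harness: a hypothetical `train_and_eval(depth, use_change)` helper stands in for a real training run at a given depth, and a change counts as transferring only if it improves validation loss at both the small and the large depth. The helper, depths, and function names are placeholders for illustration.

```python
# A minimal sketch of a scale-transfer check (hypothetical, not the
# actual speedrun tooling). Plug a real training loop into the stub.

def train_and_eval(depth: int, use_change: bool) -> float:
    """Hypothetical stub: train a model with `depth` layers, with or
    without the candidate change, and return its validation loss."""
    raise NotImplementedError("replace with a real training run")

def change_transfers(small_depth: int, large_depth: int) -> bool:
    """A change 'transfers' if it improves validation loss at both the
    small depth it was discovered on and a larger depth."""
    improved_at = []
    for depth in (small_depth, large_depth):
        baseline = train_and_eval(depth, use_change=False)
        candidate = train_and_eval(depth, use_change=True)
        improved_at.append(candidate < baseline)
        print(f"depth={depth}: baseline={baseline:.4f}, "
              f"candidate={candidate:.4f}, improved={candidate < baseline}")
    return all(improved_at)
```

The point of structuring the check this way is that a change only graduates once it wins at both scales, which is exactly the filter that rules out artifacts of one specific configuration.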