Static batching waits until a batch is full, then processes it together. Continuous batching is smarter: new requests join mid-batch as others complete.
When request A finishes generating while request B is still mid-sequence, continuous batching fills A's freed slot immediately with a waiting request. No idle slots, no waiting.
vLLM and TGI both use continuous batching. You get higher throughput without sacrificing latency. Production LLM serving requires this.
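The throughput gain is easy to see in a toy simulation. The sketch below (a simplification, not vLLM's actual scheduler; the function name and request representation are invented for illustration) models each decode step as advancing every active request by one token, refilling a slot the moment its request finishes:

```python
from collections import deque

def continuous_batching_steps(requests, max_batch_size):
    """Simulate continuous batching.

    `requests` is a list of total token counts to generate. Each decode
    step advances every active request by one token; a finished request
    frees its slot, which is refilled immediately from the waiting queue.
    Returns the total number of decode steps to finish all requests.
    """
    waiting = deque(requests)
    active = {}  # slot index -> tokens remaining
    for slot in range(max_batch_size):
        if waiting:
            active[slot] = waiting.popleft()
    steps = 0
    while active:
        steps += 1
        for slot in list(active):
            active[slot] -= 1          # one decode step per request
            if active[slot] == 0:      # request done
                if waiting:
                    # Refill the slot in the same step boundary --
                    # this is the core of continuous batching.
                    active[slot] = waiting.popleft()
                else:
                    del active[slot]
    return steps
```

With requests of 3, 1, and 2 tokens and a batch size of 2, this finishes in 3 decode steps: the 1-token request completes after step one and the 2-token request takes over its slot. A static batcher running batch [3, 1] to completion and then batch [2] would need 5 steps for the same work.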