A transformer block combines all components:
1. Multi-head self-attention
2. Residual connection + layer norm
3. Feed-forward network
4. Residual connection + layer norm
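The four steps above can be sketched in a few lines of NumPy. This is a toy post-norm block: the attention and feed-forward weights are placeholders (slices and random matrices) standing in for the learned projection matrices a real model would use.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean, unit variance.
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def multi_head_attention(x, n_heads):
    # Illustrative only: q, k, v are raw per-head slices of x.
    # Real models apply learned W_q, W_k, W_v projections first.
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    out = np.zeros_like(x)
    for h in range(n_heads):
        q = k = v = x[:, h * d_head:(h + 1) * d_head]
        scores = q @ k.T / np.sqrt(d_head)   # scaled dot-product attention
        out[:, h * d_head:(h + 1) * d_head] = softmax(scores) @ v
    return out

def feed_forward(x):
    # Position-wise ReLU MLP; weights are random for illustration.
    d_model = x.shape[-1]
    rng = np.random.default_rng(0)
    w1 = rng.normal(scale=0.02, size=(d_model, 4 * d_model))
    w2 = rng.normal(scale=0.02, size=(4 * d_model, d_model))
    return np.maximum(x @ w1, 0) @ w2

def transformer_block(x, n_heads=4):
    # Attention sub-layer: residual connection + layer norm
    x = layer_norm(x + multi_head_attention(x, n_heads))
    # Feed-forward sub-layer: residual connection + layer norm
    x = layer_norm(x + feed_forward(x))
    return x

# Stack several blocks, as LLMs do; each keeps the (seq_len, d_model) shape.
x = np.random.default_rng(1).normal(size=(8, 16))
for _ in range(3):
    x = transformer_block(x)
print(x.shape)  # (8, 16)
```

Because each block maps a `(seq_len, d_model)` matrix to another of the same shape, blocks compose cleanly into arbitrarily deep stacks.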
LLMs stack many of these blocks: Llama 7B has 32 blocks, while Llama 70B has 80. Each block refines the representation, building increasingly abstract features.