Clustering organizes data within partitions for faster filtering:
Snowflake clustering:
ALTER TABLE orders CLUSTER BY (customer_id, order_date);
Queries filtering on clustered columns skip irrelevant micro-partitions.
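To see why pruning works, here is a minimal Python sketch of the idea. It is a simplified model, not Snowflake's actual storage engine: the partition size, the row layout, and the helper names `build_partitions` and `partitions_scanned` are all assumptions made for illustration. Each partition records the minimum and maximum `customer_id` it holds, and a query only needs to read partitions whose range could contain the filter value.

```python
import random

def build_partitions(rows, rows_per_partition):
    """Split rows into fixed-size partitions and record min/max customer_id for each."""
    partitions = []
    for start in range(0, len(rows), rows_per_partition):
        chunk = rows[start:start + rows_per_partition]
        ids = [row[0] for row in chunk]
        partitions.append({"min": min(ids), "max": max(ids), "rows": chunk})
    return partitions

def partitions_scanned(partitions, customer_id):
    """Count partitions whose [min, max] range could contain customer_id."""
    return sum(1 for p in partitions if p["min"] <= customer_id <= p["max"])

random.seed(0)
# Hypothetical data: 10,000 rows of (customer_id, order_number).
rows = [(random.randrange(1000), n) for n in range(10_000)]

unclustered = build_partitions(rows, 100)        # arrival order
clustered = build_partitions(sorted(rows), 100)  # sorted by customer_id

unclustered_scanned = partitions_scanned(unclustered, 42)
clustered_scanned = partitions_scanned(clustered, 42)
print(unclustered_scanned, clustered_scanned)
```

In the unclustered layout nearly every partition has a range wide enough to include customer 42, so almost all of them must be read. After sorting, the matching rows sit in one or two adjacent partitions and the rest are skipped.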
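BigQuery clustering: Define up to four clustering columns. Column order matters: rows are sorted by the first column, then by each later column in turn, so put the most selective, most frequently filtered column first.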
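The effect of column order can be sketched with a small Python model. This is an assumption-laden illustration, not BigQuery's real block layout: the block size, the `(customer_id, order_day)` rows, and the helper `count_scanned` are hypothetical. Rows are sorted on both columns, and the sketch counts how many blocks a filter on each column would have to read.

```python
import random

def count_scanned(rows, column, value, rows_per_block=100):
    """Count blocks whose [min, max] range for `column` could contain `value`."""
    scanned = 0
    for start in range(0, len(rows), rows_per_block):
        values = [row[column] for row in rows[start:start + rows_per_block]]
        if min(values) <= value <= max(values):
            scanned += 1
    return scanned

random.seed(1)
# Hypothetical rows: (customer_id, order_day)
rows = [(random.randrange(1000), random.randrange(365)) for _ in range(10_000)]

# Model clustering on (customer_id, order_day): sort by first column, then second.
clustered = sorted(rows)

first_column_scan = count_scanned(clustered, 0, 500)   # filter on customer_id
second_column_scan = count_scanned(clustered, 1, 180)  # filter on order_day only
print(first_column_scan, second_column_scan)
```

A filter on the leading column touches only a few blocks, while a filter on the second column alone still touches most of them, because each block holds a wide spread of `order_day` values.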
Delta Lake Z-ordering:
OPTIMIZE orders ZORDER BY (customer_id);
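Z-ordering is most useful when several columns are listed, because it maps multi-column values onto a single sort key that keeps rows close together in every listed column. A minimal Python sketch of the underlying idea, bit interleaving (a Morton code), follows. It illustrates the general technique only and is not Delta Lake's implementation; `interleave_bits` and the sample points are hypothetical.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key.

    Bits of x land in even positions, bits of y in odd positions.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

# Hypothetical (customer_bucket, date_bucket) pairs.
points = [(3, 3), (0, 1), (2, 0), (1, 0), (0, 0), (1, 1)]

# Sorting by the interleaved key places points that are close in both
# dimensions next to each other in the file layout.
z_sorted = sorted(points, key=lambda p: interleave_bits(p[0], p[1]))
print(z_sorted)
```

Because neither column dominates the sort key, min/max ranges stay reasonably narrow for both columns, at the cost of being less tight on either one than a plain sort on that column alone.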
Clustering helps most on large tables with selective filters. Small tables span only a few partitions, so there is little to skip, and the benefit doesn't justify the cost of keeping the data clustered.