Data lineage tracks how data moves from its source systems, through each transformation, to its final destinations.
Upstream lineage: Where did this table's data come from? Which source systems, transformations, intermediate tables?
Downstream lineage: What depends on this table? Which dashboards, ML models, exports break if it fails?
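Both questions are graph traversals over the same set of edges, walked in opposite directions. Here is a minimal sketch, assuming lineage is stored as (upstream, downstream) pairs; the table names and the `upstream`/`downstream` helpers are made up for illustration and do not come from any particular tool.

```python
from collections import defaultdict, deque

# Hypothetical edges: (upstream_asset, downstream_asset).
EDGES = [
    ("crm.customers", "staging.stg_customers"),
    ("shop.orders", "staging.stg_orders"),
    ("staging.stg_customers", "marts.customer_orders"),
    ("staging.stg_orders", "marts.customer_orders"),
    ("marts.customer_orders", "dashboard.revenue"),
    ("marts.customer_orders", "ml.churn_model"),
]


def build_graph(edges):
    """Index the edges in both directions so either walk is a plain lookup."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for up, down in edges:
        parents[down].add(up)
        children[up].add(down)
    return parents, children


def walk(start, neighbors):
    """Breadth-first traversal; returns every asset reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


PARENTS, CHILDREN = build_graph(EDGES)


def upstream(table):
    """Where did this table's data come from?"""
    return walk(table, PARENTS)


def downstream(table):
    """What depends on this table?"""
    return walk(table, CHILDREN)
```

For example, `downstream("staging.stg_orders")` returns the mart, the dashboard, and the ML model, while `upstream("crm.customers")` returns an empty set because it is a source.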
Use cases:
- Impact analysis before schema changes
- Root cause analysis during incidents
- Compliance audits (where did this PII go?)
- Debugging data quality issues
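Two of these use cases, impact analysis and PII tracing, can be sketched as reachability queries on a downstream map. This is a minimal sketch: the asset names, the `PII_SOURCES` tag set, and both helper functions are hypothetical, and a real catalog would usually track this at column level rather than table level.

```python
# Hypothetical lineage: each key maps to the assets that read from it.
DOWNSTREAM = {
    "crm.customers": ["staging.stg_customers"],
    "staging.stg_customers": ["marts.customer_orders", "exports.marketing_csv"],
    "staging.stg_orders": ["marts.customer_orders"],
    "marts.customer_orders": ["dashboard.revenue"],
}

# Hypothetical tag: source tables known to hold PII.
PII_SOURCES = {"crm.customers"}


def reachable(start, graph):
    """Return every asset reachable from start (iterative depth-first search)."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def impact_of_change(table):
    """Impact analysis: assets to review before altering this table's schema."""
    return sorted(reachable(table, DOWNSTREAM))


def pii_footprint():
    """Compliance: every asset that may hold data derived from a PII source."""
    footprint = set(PII_SOURCES)
    for src in PII_SOURCES:
        footprint |= reachable(src, DOWNSTREAM)
    return sorted(footprint)
```

Root cause analysis is the same idea in reverse: start from the broken asset and walk the edges upstream until the first failing parent is found.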
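dbt builds lineage automatically from the `ref()` and `source()` calls in your models, but only for what lives inside the dbt project. Metadata catalogs like Atlan and DataHub aim to stitch lineage together across the rest of the stack (ingestion, warehouse, BI).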
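dbt's lineage can also be read programmatically from the `manifest.json` artifact it writes to the `target/` directory. The sketch below assumes the manifest exposes a `child_map` key mapping each node ID to its direct children; the exact artifact layout depends on the dbt version, so check it against your own manifest. The sample data, project name, and function names are hypothetical.

```python
import json
from pathlib import Path


def load_manifest(path="target/manifest.json"):
    """Read a dbt manifest from disk (the path is dbt's default target dir)."""
    return json.loads(Path(path).read_text())


def downstream_nodes(manifest, node_id):
    """Collect every node that depends, directly or indirectly, on node_id."""
    child_map = manifest.get("child_map", {})
    seen, stack = set(), [node_id]
    while stack:
        for child in child_map.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return sorted(seen)


# Hypothetical stand-in for a real manifest, trimmed to the one key used here.
SAMPLE_MANIFEST = {
    "child_map": {
        "source.shop.raw.orders": ["model.shop.stg_orders"],
        "model.shop.stg_orders": ["model.shop.fct_orders"],
        "model.shop.fct_orders": ["exposure.shop.revenue_dashboard"],
        "exposure.shop.revenue_dashboard": [],
    }
}

impacted = downstream_nodes(SAMPLE_MANIFEST, "model.shop.stg_orders")
```

With a real project, replace `SAMPLE_MANIFEST` with `load_manifest()` after a dbt run has produced the artifact.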