Change data capture (CDC) records row-level changes made to source databases: inserts, updates, and deletes.
Log-based CDC: Reads the database transaction log (MySQL binlog, Postgres WAL). Low impact on the source. Captures every committed change, including deletes.
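A minimal sketch of what a consumer does with log-based change events, assuming each event carries an `op` code (`c` create, `u` update, `d` delete) plus `before` and `after` row images. This shape is loosely modeled on Debezium's event envelope; the `apply_change` helper, the `id` key column, and the sample rows are hypothetical.

```python
from typing import Any


def apply_change(replica: dict[int, dict[str, Any]], event: dict[str, Any]) -> None:
    """Apply one row-level change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    if op in ("c", "u"):
        # Inserts and updates both carry the new row image in "after".
        row = event["after"]
        replica[row["id"]] = row
    elif op == "d":
        # Deletes carry only the old row image in "before".
        replica.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unknown op: {op!r}")


# Hypothetical events, in the order they were committed at the source.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "ada"}},
    {"op": "c", "before": None, "after": {"id": 2, "name": "bob"}},
    {"op": "u", "before": {"id": 1, "name": "ada"}, "after": {"id": 1, "name": "ada l."}},
    {"op": "d", "before": {"id": 2, "name": "bob"}, "after": None},
]

replica: dict[int, dict[str, Any]] = {}
for ev in events:
    apply_change(replica, ev)
```

Because the delete arrives as an explicit event, the replica ends up holding only the updated row 1. Events must be applied in commit order for the result to match the source.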
Query-based CDC: Polls for changes using timestamp or version columns. Simpler to set up, but misses hard deletes and any intermediate updates between polls, and puts more load on the source.
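A rough sketch of query-based polling, using an in-memory SQLite table so it runs as-is. The `users` table, its integer `updated_at` column, and the `poll_changes` helper are assumptions made for illustration, not part of any particular tool.

```python
import sqlite3


def poll_changes(conn: sqlite3.Connection, last_seen: int):
    """Return rows modified after last_seen, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM users "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ada", 100), (2, "bob", 101)],
)

# First poll picks up both initial rows.
first_batch, mark = poll_changes(conn, 0)

# Between polls: one row is updated, another is hard-deleted.
conn.execute("UPDATE users SET name = 'ada l.', updated_at = 102 WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 2")

# Second poll sees the update, but nothing signals that row 2 is gone.
second_batch, mark = poll_changes(conn, mark)
```

The second poll returns only the updated row, so a downstream copy would keep row 2 indefinitely. A common workaround is a soft-delete flag column on the source table. The strict `>` comparison can also skip a row that commits later with a timestamp equal to the current high-water mark.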
Tools:
- Debezium: Open source, Kafka-native
- Fivetran: Managed, supports many sources
- AWS DMS: AWS-native solution
CDC enables near-real-time replication without full table scans. For large tables, it is often the only practical approach.