Messages can be delivered more than once, so your consumers must handle duplicates safely.
Why duplicates?
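- The consumer crashes after processing a message but before acknowledging it
- A network failure loses the acknowledgment, so the broker redelivers the message
- At-least-once delivery guarantee: the broker retries until it receives an acknowledgment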
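The crash-before-acknowledge case can be sketched with a toy queue. This is only an illustration under stated assumptions: `InMemoryQueue`, `run_consumer`, and the `crash_before_ack_once` flag are hypothetical names, not part of any real broker client, and the "crash" is simulated by skipping the acknowledgment once.

```python
from collections import deque


class InMemoryQueue:
    """Toy at-least-once queue: a message stays queued until it is acked."""

    def __init__(self, messages):
        self._pending = deque(messages)

    def receive(self):
        # Peek without removing; removal only happens on ack.
        return self._pending[0] if self._pending else None

    def ack(self):
        self._pending.popleft()


def run_consumer(queue, processed_log, crash_before_ack_once):
    """Process messages; optionally simulate one crash between process and ack."""
    crashed = False
    while (message := queue.receive()) is not None:
        processed_log.append(message)  # the side effect happens here
        if crash_before_ack_once and not crashed:
            crashed = True
            continue  # "crash": the ack is never sent, so the queue redelivers
        queue.ack()


log = []
run_consumer(InMemoryQueue(["order-1", "order-2"]), log, crash_before_ack_once=True)
print(log)  # ['order-1', 'order-1', 'order-2']
```

The first message is processed twice because the queue never saw its acknowledgment, which is the behavior an at-least-once guarantee implies.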
Solutions:
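- Idempotent operations: processing a message twice has the same effect as processing it once
- Deduplication table: store the IDs of processed messages and skip any ID already seen
- Exactly-once semantics: possible with Kafka transactions, for example, but complex to set up and operate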
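To make the first option concrete, here is a small sketch contrasting a non-idempotent handler with an idempotent one. The function names and the `balances` dictionary are assumptions made up for this illustration; the point is only that a message carrying the final value can be replayed safely, while a message carrying a delta cannot.

```python
def apply_increment(balances, account, amount):
    """Not idempotent: each redelivery adds the amount again."""
    balances[account] = balances.get(account, 0) + amount


def apply_set_balance(balances, account, new_balance):
    """Idempotent: the message carries the final value, so replays are harmless."""
    balances[account] = new_balance


incremented = {}
assigned = {}
for _ in range(2):  # simulate the same message arriving twice
    apply_increment(incremented, "acct-1", 50)
    apply_set_balance(assigned, "acct-1", 50)

print(incremented)  # {'acct-1': 100}
print(assigned)     # {'acct-1': 50}
```

When an operation cannot be expressed this way, a deduplication check is usually the simpler fallback.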
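# Process each message ID at most once
if message_id not in processed_ids:
    process(message)
    processed_ids.add(message_id)
# Acknowledge duplicates too, otherwise the broker keeps redelivering them
acknowledge()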
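An in-memory set of IDs is lost on restart, and a crash between processing and recording the ID still allows a duplicate. One possible way around both issues, sketched here with Python's standard `sqlite3` module, is to record the message ID and apply the side effect in the same database transaction. The table names (`processed_messages`, `payments`) and the functions `make_store` and `handle_message` are hypothetical, and the sketch assumes the side effect lives in the same database as the deduplication table.

```python
import sqlite3


def make_store():
    """Create an in-memory database; a real deployment would use a shared, durable one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE payments (message_id TEXT, amount INTEGER)")
    conn.commit()
    return conn


def handle_message(conn, message_id, amount):
    """Return True if the message was processed, False if it was a duplicate."""
    try:
        with conn:  # commits on success, rolls back on exception
            # The PRIMARY KEY rejects a second insert of the same ID.
            conn.execute(
                "INSERT INTO processed_messages (message_id) VALUES (?)",
                (message_id,),
            )
            conn.execute(
                "INSERT INTO payments (message_id, amount) VALUES (?, ?)",
                (message_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # already processed; the caller should still acknowledge


conn = make_store()
results = [handle_message(conn, "msg-1", 50) for _ in range(2)]
payment_count = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
print(results, payment_count)  # [True, False] 1
```

Because both inserts commit or roll back together, a crash midway leaves neither the ID nor the payment recorded, so the redelivered message is processed cleanly on the next attempt.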