Choose between batch and stream processing based on latency requirements and data characteristics.
Batch processing:
- Process large volumes of accumulated data at scheduled intervals
- Higher latency is acceptable (minutes to hours)
- Simpler error handling: a failed batch can just be rerun over the same input
- Use cases: daily reports, ML training (a sketch follows this list)
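As a minimal sketch, the PySpark job below aggregates one day's partition of events into a report. The bucket paths, the `events` schema (`user_id`, `amount`), and the app name are illustrative assumptions, not taken from the text above.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical daily-report batch job: paths and column names are assumptions.
spark = SparkSession.builder.appName("daily-report").getOrCreate()

# Read one day's partition of accumulated events (a bounded input).
events = spark.read.parquet("s3://example-bucket/events/dt=2024-01-01/")

# Aggregate per user over the whole partition.
report = (
    events.groupBy("user_id")
          .agg(F.count("*").alias("event_count"),
               F.sum("amount").alias("total_amount"))
)

# Overwrite makes the job idempotent: recovery from failure is just a rerun.
report.write.mode("overwrite").parquet("s3://example-bucket/reports/dt=2024-01-01/")
```

Because the input is fixed and the write is an idempotent overwrite, the "simpler error handling" bullet falls out for free: rerunning the job produces the same report.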
Stream processing:
- Process each record as it arrives
- Low latency required (sub-second to seconds)
- Must handle late and out-of-order data (event-time windows, watermarks)
- Use cases: fraud detection, real-time dashboards (a sketch follows this list)
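The third bullet is usually the tricky part, so here is a conceptual, framework-free Python sketch of event-time tumbling windows with a watermark. Everything in it (window size, lateness bound, function names) is invented for illustration; real systems such as Flink or Kafka Streams provide these primitives natively.

```python
from collections import defaultdict

WINDOW_SIZE = 60        # seconds per tumbling window (illustrative)
ALLOWED_LATENESS = 10   # how far the watermark trails the max event time seen

windows = defaultdict(list)   # window start -> buffered events
closed = set()                # starts of windows already emitted
max_event_time = 0

def on_event(event_time, payload):
    """Assign an event to its window by its own timestamp, not arrival order."""
    global max_event_time
    start = event_time - event_time % WINDOW_SIZE
    if start in closed:
        # Too late: its window already fired. Real systems drop this
        # or route it to a side output for reconciliation.
        print(f"dropped late event at t={event_time}")
        return
    windows[start].append(payload)
    max_event_time = max(max_event_time, event_time)
    flush()

def flush():
    """Emit every window whose end has fallen behind the watermark."""
    watermark = max_event_time - ALLOWED_LATENESS
    for start in sorted(windows):
        if start + WINDOW_SIZE <= watermark:
            print(f"window [{start}, {start + WINDOW_SIZE}): "
                  f"{len(windows[start])} events")
            closed.add(start)
            del windows[start]

# Out-of-order arrival: the t=61 event arrives before the t=59 one, yet the
# t=59 record is still credited to the [0, 60) window before it fires.
for t in (5, 30, 61, 59, 130):
    on_event(t, {"t": t})
```

The watermark trails the maximum event time seen so far; a window only closes once the watermark passes its end, which is what lets moderately late events land in the right window at the cost of added emission delay.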
Tools: Batch (Spark, Hadoop MapReduce), Stream (Kafka Streams, Flink). Flink unifies both by treating a batch as a bounded stream.
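To make that last claim concrete, a hedged PyFlink sketch (exact import paths and signatures may vary across Flink versions; treat this as an assumption-laden outline): the same DataStream pipeline can run in streaming or batch execution mode by switching the runtime mode.

```python
from pyflink.datastream import StreamExecutionEnvironment, RuntimeExecutionMode

env = StreamExecutionEnvironment.get_execution_environment()

# Same pipeline code either way: BATCH mode processes the bounded input
# as a batch; STREAMING mode would process it continuously.
env.set_runtime_mode(RuntimeExecutionMode.BATCH)

env.from_collection([1, 2, 3, 4]) \
   .map(lambda x: x * 2) \
   .print()

env.execute("unified-job")
```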