Interview question: "Design a logging system that handles terabytes of logs per day."
Components:
- Collection: Agents (Fluentd, Filebeat) on each host
- Transport: Kafka for buffering and durability
- Processing: Stream processing for parsing and enrichment
- Storage: Object storage (S3) for cold data, search index for hot data
- Query: Full-text search (Elasticsearch) or a columnar store (ClickHouse)
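As a rough illustration of the processing stage, here is a minimal sketch of a parse-and-enrich step. It assumes logs arrive as JSON lines; the function name, the `host.` prefix, and the `ingested_at` and `parse_error` fields are hypothetical choices for this example, not part of any particular tool. In a real pipeline this logic would sit inside a stream consumer reading from Kafka.

```python
import json
from datetime import datetime, timezone


def parse_and_enrich(raw_line: str, host_metadata: dict) -> dict:
    """Parse one log line and attach host metadata (hypothetical schema)."""
    try:
        event = json.loads(raw_line)
        if not isinstance(event, dict):
            raise ValueError("log line is not a JSON object")
    except ValueError:
        # Keep unparseable lines instead of dropping them, and flag them
        # so they can be inspected later.
        event = {"message": raw_line, "parse_error": True}

    # Default the level when the producer did not set one.
    event.setdefault("level", "INFO")
    # Record when the pipeline saw the event (UTC, ISO 8601).
    event["ingested_at"] = datetime.now(timezone.utc).isoformat()
    # Enrichment: copy host-level metadata onto the event.
    for key, value in host_metadata.items():
        event[f"host.{key}"] = value
    return event
```

Keeping malformed lines with a flag, rather than discarding them, is one possible design choice; it trades some storage for easier debugging of broken producers.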
Cost optimization:
- Sample verbose logs
- Tiered storage with lifecycle policies
- Index only the fields that queries filter on; store the rest unindexed
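The sampling and selective-indexing ideas above can be sketched as follows. The sample rates, the indexed field names, and both function names are assumptions made up for this example. Sampling is hash-based so that the decision is deterministic per key (here a `trace_id`, if present), which keeps all events of a sampled trace together.

```python
import zlib
from typing import Tuple

# Assumed sample rates per level; levels not listed are kept at 100%.
SAMPLE_RATES = {"DEBUG": 0.01, "INFO": 0.10}
# Hypothetical set of fields worth indexing for search.
INDEXED_FIELDS = {"timestamp", "level", "service", "trace_id"}


def keep_event(event: dict) -> bool:
    """Decide deterministically whether a log event survives sampling."""
    rate = SAMPLE_RATES.get(event.get("level", "INFO"), 1.0)
    if rate >= 1.0:
        return True
    # Hash a stable key into 10,000 buckets and keep the lowest ones.
    key = str(event.get("trace_id", event.get("message", "")))
    bucket = zlib.crc32(key.encode("utf-8")) % 10_000
    return bucket < rate * 10_000


def split_for_index(event: dict) -> Tuple[dict, dict]:
    """Return (fields to index, full event to store unindexed)."""
    indexed = {k: v for k, v in event.items() if k in INDEXED_FIELDS}
    return indexed, event
```

With these assumed rates, WARN and ERROR events are never dropped, while roughly 1 in 100 DEBUG traces is kept.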
Design approach:
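Estimate daily volume and average ingest rate first. Plan retention tiers (hot, cold) from those numbers. Design storage and indexes around the expected query patterns. Then budget for storage and compute.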
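A back-of-the-envelope estimate along these lines might look like the sketch below. Every input is an assumption to be replaced with real measurements: the compression ratio for cold storage, the index overhead on the hot tier, and the retention windows are illustrative parameters, not benchmarks. The function names are hypothetical.

```python
def estimate_capacity(daily_raw_tb: float, hot_days: int, cold_days: int,
                      compression_ratio: float, index_overhead: float) -> dict:
    """Rough steady-state storage and ingest estimate (decimal units)."""
    # Hot tier: raw data plus index overhead, kept for hot_days.
    hot_tb = daily_raw_tb * hot_days * (1 + index_overhead)
    # Cold tier: compressed objects, kept for cold_days.
    cold_tb = daily_raw_tb * cold_days / compression_ratio
    # Average ingest rate in MB/s (1 TB = 1e6 MB, 86,400 s per day).
    avg_mb_per_s = daily_raw_tb * 1e6 / 86_400
    return {
        "hot_tb": hot_tb,
        "cold_tb": cold_tb,
        "total_tb": hot_tb + cold_tb,
        "avg_mb_per_s": avg_mb_per_s,
    }


def tier_for_age(age_days: int, hot_days: int, cold_days: int) -> str:
    """Map a log's age to a tier; cold retention is counted from ingestion."""
    if age_days < hot_days:
        return "hot"
    if age_days < cold_days:
        return "cold"
    return "expired"
```

For example, `estimate_capacity(5, 7, 90, 10, 0.5)` gives 52.5 TB hot, 45.0 TB cold, and 97.5 TB total, at an average ingest of about 58 MB/s. Peak traffic is usually well above the average, so the transport and processing stages need headroom beyond this figure.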