Distributed file systems span multiple machines:
HDFS: Hadoop Distributed File System. Files split into blocks ( MB default). Blocks replicated across nodes.
Architecture: NameNode tracks metadata. DataNodes store actual blocks. Client reads from nearest replica.
Trade-offs:
- Write once, read many. Not good for random writes.
- High latency for small files.
- Great for batch processing large datasets.
Modern alternatives: Cloud object storage (S3) often replaces HDFS. Lower cost, managed, but different access patterns.