Frontier: manages the set of URLs waiting to be crawled. A priority queue serves important pages first (ranked by signals such as PageRank and freshness); separate per-domain queues enforce politeness (rate-limit each host).
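A minimal sketch of this structure, assuming a global heap that picks the next domain by priority while per-domain FIFO queues plus a fixed politeness delay throttle each host (class and parameter names are illustrative, not from any real crawler):

```python
import heapq
import time
from collections import defaultdict, deque
from urllib.parse import urlparse

class Frontier:
    """Sketch: priority heap selects a domain; per-domain FIFO queues
    plus a politeness delay keep any single host from being hammered."""

    def __init__(self, politeness_delay=1.0):
        self.politeness_delay = politeness_delay
        self.queues = defaultdict(deque)   # domain -> URLs awaiting fetch
        self.heap = []                     # (-priority, seq, domain)
        self.next_allowed = {}             # domain -> earliest fetch time
        self.seq = 0                       # tie-breaker for heap ordering

    def add(self, url, priority=0.0):
        domain = urlparse(url).netloc
        if not self.queues[domain]:        # first URL for this domain
            self.seq += 1
            heapq.heappush(self.heap, (-priority, self.seq, domain))
        self.queues[domain].append(url)

    def next_url(self, now=None):
        """Next URL whose domain is past its politeness delay,
        or None if every queued domain is still cooling down."""
        now = time.monotonic() if now is None else now
        deferred, result = [], None
        while self.heap:
            entry = heapq.heappop(self.heap)
            domain = entry[2]
            if self.next_allowed.get(domain, 0.0) <= now:
                result = self.queues[domain].popleft()
                self.next_allowed[domain] = now + self.politeness_delay
                if self.queues[domain]:    # domain still has work: re-queue
                    heapq.heappush(self.heap, entry)
                break
            deferred.append(entry)         # still in politeness window
        for entry in deferred:
            heapq.heappush(self.heap, entry)
        return result
```

Note the design choice: the heap holds domains, not URLs, so a high-priority domain with many pending URLs can't starve politeness; it is re-queued only after its delay expires.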
Distributed frontier: partition URLs by domain hash across crawler nodes, so each node owns a subset of domains and politeness state stays local to one node. Deduplication: check each URL against a seen-set (e.g. a Bloom filter) before enqueueing. Normalization: canonicalize URLs (lowercase host, strip fragments and default ports) so trivial variants don't become duplicates.
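The three mechanisms above can be sketched as follows; the function names, bit-array sizing, and hash counts are illustrative assumptions, not a reference implementation:

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Canonicalize so trivially different forms map to one key:
    lowercase scheme/host, drop fragment and default port, ensure a path."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""            # urlsplit lowercases the host
    default_port = {"http": 80, "https": 443}.get(scheme)
    if parts.port and parts.port != default_port:
        host = f"{host}:{parts.port}"
    path = parts.path or "/"
    return urlunsplit((scheme, host, path, parts.query, ""))

def owner_node(url, num_nodes):
    """Hash the domain (not the full URL) so every URL of a host lands
    on the same node, keeping that host's politeness state local."""
    domain = urlsplit(url).hostname or ""
    digest = hashlib.sha1(domain.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

class BloomFilter:
    """Tiny Bloom filter: probabilistic seen-set, no false negatives,
    tunable false-positive rate via size_bits and num_hashes."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        self.size, self.k = size_bits, num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        for i in range(self.k):            # k independent hash positions
            h = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Usage: normalize first, then test membership, so `http://Example.com:80/a#top` and `http://example.com/a` dedupe to the same key. A Bloom filter can return false positives (a never-seen URL reported as seen, so it is skipped) but never false negatives, an acceptable trade for a seen-set holding billions of URLs.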