Word embeddings map words to dense vectors where similar words are nearby.
Word2Vec: Trained to predict words within a local context window. Two architectures: Skip-gram (predict context words from the center word) and CBOW (predict the center word from its context).
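A minimal sketch of how Skip-gram training pairs are generated from a corpus (the function name and toy corpus are illustrative, not from any particular library):

```python
# Generate (center, context) Skip-gram training pairs from a token
# list, using a symmetric context window. Each pair is one training
# example: predict the context word given the center word.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

corpus = "the cat sat on the mat".split()
pairs = skipgram_pairs(corpus, window=1)
print(pairs[:4])
```

CBOW would instead group all context words of a window into a single example predicting the center word.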
GloVe: Global Vectors. Factorizes a global word co-occurrence matrix: vectors are fit so that their dot products approximate the log co-occurrence counts.
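Concretely, the GloVe training objective is a weighted least-squares fit of vector dot products to log co-occurrence counts:

```latex
J = \sum_{i,j} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2
```

Here $X_{ij}$ is how often word $j$ occurs in the context of word $i$, and $f$ is a weighting function that down-weights rare and very frequent pairs.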
Key property: Vector arithmetic captures relationships. king - man + woman ≈ queen.
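A toy illustration of the analogy arithmetic (these 3-d vectors are made up for demonstration, not trained; real embeddings have hundreds of dimensions and come from models like Word2Vec or GloVe):

```python
import numpy as np

# Hand-crafted toy embeddings: dim 0-1 encode "royalty-ish" features,
# dim 2 encodes a gender-like feature. Purely illustrative values.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman, then find the nearest word by cosine similarity,
# excluding the query words (standard practice in analogy evaluation).
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(target, emb[w]))
print(best)  # queen
```

Note that excluding the input words matters: without it, the nearest neighbor of the target vector is often one of the query words themselves.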
Limitations: One vector per word, so no context sensitivity — "bank" gets the same vector whether the sentence is about rivers or finance.
Interview tip: Know that BERT/GPT produce contextual embeddings, solving the polysemy problem.