Here is the idea: if you split the tree into chains, any path crosses at most chains. Each chain is a sequence of ancestor-descendant nodes. You can represent each chain as a contiguous segment. To query a path from to , you query segments (one per chain the path crosses) and combine the results.
The magic is in choosing which edges form chains so that paths cross few chains. If you pick chains randomly, a path might cross chains. The heavy-light edge rule guarantees crossings.
Space complexity is for the data structures used.