Before continuing, think about both approaches:
Two BFS approach: Start from any node and BFS to find the farthest node x.
Then BFS from x to find the farthest node y; the distance from x to y is the diameter. Why does this work? (Hint: x is one end of the diameter)
DP approach: For each node, compute the max depth in its subtree.
At each node, combine the two largest child depths into a candidate path through that node; the diameter is the best candidate over all nodes. Why does this work? (Hint: The diameter passes through some node)
Try to implement both in your head. Which one is easier to code? Which one is easier to understand?
Take a moment to think, then continue.
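
Once you have thought it through, here is a minimal sketch of both approaches to compare against your own version. It assumes an unweighted tree stored as an adjacency dict (node -> list of neighbours) and measures the diameter in edges; the function names farthest, diameter_two_bfs and diameter_dp are made up for this illustration and are not taken from the text above.

```python
from collections import deque


def farthest(adj, start):
    """BFS from start; return (a farthest node, its distance in edges)."""
    dist = {start: 0}
    queue = deque([start])
    far = start
    while queue:
        u = queue.popleft()
        if dist[u] > dist[far]:
            far = u
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return far, dist[far]


def diameter_two_bfs(adj):
    """Two BFS passes: any node -> x, then x -> y; dist(x, y) is the diameter."""
    x, _ = farthest(adj, next(iter(adj)))  # any start node works
    _, d = farthest(adj, x)
    return d


def diameter_dp(adj):
    """DP over the tree: at each node, add the two largest child depths."""
    root = next(iter(adj))  # assumed: rooting at an arbitrary node
    parent = {root: None}
    order = []  # nodes in an order where parents come before children
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)

    depth = {}  # depth[u] = edges on the longest downward path from u
    best = 0
    for u in reversed(order):  # children are handled before their parent
        top1 = top2 = 0
        for v in adj[u]:
            if v != parent[u]:
                d = depth[v] + 1
                if d > top1:
                    top1, top2 = d, top1
                elif d > top2:
                    top2 = d
        depth[u] = top1
        best = max(best, top1 + top2)  # best path bending at u
    return best
```

The DP version is written iteratively to avoid hitting Python's recursion limit on deep trees; a recursive DFS expresses the same idea in fewer lines. Both functions should agree on any tree, which makes one a handy check for the other.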