Here is the core idea: start BFS from any random node. Find the farthest node from it. Call that node . Now run BFS from and find the farthest node from . Call that . The distance from to is the diameter.
Two BFS runs instead of runs. That is total time.
This works because one endpoint of the longest path will always be the farthest node from your starting point. I'll prove why in the next unit.
Space complexity is for the data structures used.