Your first instinct might be to use a heap. Push elements as they enter, pop as they leave. But heaps don't support efficient removal of arbitrary elements. When an element leaves the window, finding it in the heap takes time. What about a balanced BST like a TreeMap? That works in , which is decent.
But we can do better. The monotonic queue gives total time with space for the deque. The observation: in a sliding window, you only care about candidates that could become the maximum. Smaller elements behind a larger one will never matter.