Solution: Sliding Window Median
Let's solve the Sliding Window Median problem using the Two Heaps pattern.
Statement#
Given an integer array, nums, and an integer, k, there is a sliding window of size k, which is moving from the very left to the very right of the array. We can only see the k numbers in the window. Each time the sliding window moves right by one position.
Given this scenario, return the median of the each window. Answers within of the actual value will be accepted.
Constraints:
-
knums.length -
nums[i]
Solution#
So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.
Naive approach#
The naive solution is to use nested loops to traverse over the array. The outer loop ranges over the entire array, and the nested loop is used to iterate over windows of elements. For each window, we’ll first sort the elements and then compute the median. We’ll append this value to the median list and move the window one step forward.
The above algorithm will have a total time complexity of , that is, :
- Traversal: , where is the window size and is the number of elements in the array.
- Sorting: Assuming we perform sorting using quicksort, the complexity would be .
The space complexity would be because the space complexity of quicksort is .
Optimized approach using two heaps#
Since the median, let’s call it x, is the middle value in a list of sorted elements, we know that half of the elements will be smaller than (or equal to) x, and the other half will be greater than (or equal to) x.
We can divide the list into two halves. One half will store the smaller numbers—let’s call it small_list. The other half will store the larger numbers—let’s call it large_list. If the total number of elements is odd, we keep the element in small_list. The median will be the largest number in small_list if the number of elements is odd. Otherwise, if the total number of elements is even, the median will be the mean of the largest number in small_list and the smallest number in large_list. The best data structure for finding the smallest or largest number in a list of numbers is a heap.
-
We store the first half of the numbers of the window in a max-heap. We use a max-heap because we want to know the largest number in the first half of the window.
-
We use a min-heap to store the second half of the numbers, since we want to know the smallest number in the second half of the window.
Having said that, the two heaps pattern is a perfect fit for this problem.
While sliding the window over the array, as soon as an element leaves the window, we need to remove it form the heap, too. Removing an element from the heap takes time, where is the size of the heap. This is the time spent in finding the element in the heap. This makes our sliding function very costly. Therefore, we don’t remove an element right away. Instead, we maintain a hash map to keep track of the elements to be removed from the heaps. We only remove an element from the heap if it’s at the top of the heap, which reduces the time complexity to . In addition, if the element to be removed is not at the top of the heap, it doesn’t disturb the calculation of the medians.
Here’s what the algorithm will do:
-
Add
kelements tosmall_list. Since we’re implementing a max-heap, we’ll multiply each element with -1. -
Transfer the top elements from
small_listtolarge_list. We’ll multiply each element by -1 again to convert it to its original value. -
Append the median to the median list. In case of an odd window size, the median is the top element of
small_list. Otherwise, it’s the mean of the top elements of the two lists. -
We use a
balancevariable to check if members belonging to the sliding window are present in the max-heap (small_list). If there’s an extra element, we transfer the top element from the max-heap to the min-heap. The second highest element then springs to the top of the max-heap. Similarly, if more than elements end up in the min-heap (large_list), we restore the balance by popping the smallest element from the min-heap and adding it to the max-heap. -
Move the window one step forward and add the outgoing element to the hash map. If the top element of the small list or the large list is present in the hash map with a frequency greater than 0, we remove it from the respective heap.
-
Repeat steps 3–5 until all the numbers in the
numshave been processed.
1 of 18
2 of 18
3 of 18
4 of 18
5 of 18
6 of 18
7 of 18
8 of 18
9 of 18
10 of 18
11 of 18
12 of 18
13 of 18
14 of 18
15 of 18
16 of 18
17 of 18
18 of 18
Let’s look at the code for this solution below:
Solution summary#
To recap, the solution to this problem can be divided into the following parts:
- Populate max-heap with
kelements. - Transfer elements from the max-heap to the min-heap.
- If the window size is odd, the median is the top of the max-heap. Otherwise, it’s the mean of the top elements of the two heaps.
- Move the window forward and add the outgoing number in the hash map, which is used to track the outgoing numbers.
- Rebalance the heaps if they have more elements.
- If the top element of the max-heap or the min-heap is present in the hash map with a frequency greater than 0, this element is irrelevant. We remove it from the respective heap and the hash map.
- Repeat the process until all elements are processed.
Time complexity#
The time spent in creating the heaps is . As we process all elements in the loop, it runs for time, where is the size of the input array. Inside the loop, we push and pop elements from the heaps, which takes time. This is because the size of a heap can grow up to in the worst case. Therefore, the total time complexity of the algorithm above is . As , the time complexity can be represented as .
Space complexity#
As the size of a heap can grow up to in the worst case, the space occupied by it is . Similarly, the space occupied by the hash map will also be . As , the space complexity of the above algorithm is .
Sliding Window Median
Final Remarks