Top K Elements: Introduction

Let’s go over the Top K Elements pattern, its real-world applications, and some problems we can solve with it.

Overview#

The top K elements pattern helps find some specific kk number of elements from the given data with optimum time complexity.

Many problems ask us to find the top, the smallest, or the most/least frequent kk elements in an unsorted list of elements. To solve such problems, sorting the list takes O(nlog(n))O(n \log(n)) time, then finding the kk elements takes O(k)O(k) time. However, the top kk elements pattern can allow us to solve the problem using O(n.logk)O(n. \log k) time without sorting the list first.

Which data structure can we use to solve such problems? The best data structure to keep track of the smallest or largest kk elements is heap. With this pattern, we either use a max-heap or a min-heap to find the smallest or largest kk elements, respectively.

For example, let’s look at how this pattern takes steps to solve the problem of finding the top kk largest elements (using min-heap) or top kk smallest elements (using max-heap):

  1. Insert the first kk elements from the given set of elements to the min-heap or max-heap.

  2. Iterate through the rest of the elements.

    1. For min-heap, if you find the larger element, remove the top (smallest number) of the min-heap and insert the new larger element.
    2. For max-heap, if you find the smaller element, remove the top (largest number) of the max-heap and insert the new smaller element.

Iterating the complete list takes O(n)O(n) time, and the heap takes O(logk)O(\log k) time for insertion. However, we get the O(1)O(1) access to the kk elements using the heap.

Let’s look at the following illustration to understand how we can use min-heap to find the top kk elements.

Created with Fabric.js 3.6.6 min-heap 3 5 9 3 5 9 4 2 12 15

1 of 5

Created with Fabric.js 3.6.6 4 5 9 min-heap 3 5 9 4 2 12 15

2 of 5

Created with Fabric.js 3.6.6 min-heap 3 5 9 4 2 12 15 4 5 9

3 of 5

Created with Fabric.js 3.6.6 min-heap 3 5 9 4 2 12 15 5 9 12

4 of 5

Created with Fabric.js 3.6.6 9 12 15 min-heap 3 5 9 4 2 12 15

5 of 5

Examples#

The following examples illustrate some problems that can be solved with this approach:

Created with Fabric.js 3.6.6 max-heap Sort characters by frequency String "buubble" Key Value b 3 u 2 l 1 e 1 3 : b 2 : u 1 : l 1 : e Output string "bbbuule" We'll calculate frequencies of each character and store them in the hash map. We'll use the frequencies to maintain the max-heap and sort characters by frequency. Hashmap

1 of 2

Created with Fabric.js 3.6.6 Connect n ropes with minimum cost First of all, insert all n ropes into the min-heap. Then, extract the minimum andthe second minimum from the min-heap. Add the extracted values and insert the calculated value back to the min-heap. Maintain the total cost variable to keeptrack of minimum cost while calculating the cost. Ropes 6 4 3 2 2 3 4 6 6 5 4 6 9 15 cost = 2 + 3 = 5 cost = 4 + 5 = 9 cost = 6 + 9 = 15 Minimum cost 5 + 9 + 15 = 29

2 of 2

Does my problem match this pattern?#

  • Yes, if both of these conditions are fulfilled:

    • We need to find the largest, smallest, most frequent, or least frequent subset of elements in an unsorted list.

    • This may be the requirement of the final solution, or it may be necessary as an intermediate step toward the final solution.

  • No, if any of these conditions is fulfilled:

    • The input data structure does not support random access.
    • The input data is already sorted according to the criteria relevant to solving the problem.
    • If only 11 extreme value is required, that is, k=1k = 1, as that problem can be solved in O(n)O(n) with a simple scan through the input array.

Real-world problems#

Many problems in the real world use the top K elements pattern. Let’s look at some examples.

  • Uber: Select at least the nn nearest drivers within the user’s vicinity, avoiding the drivers that are too far away.

  • Stocks: Given the set of IDs of brokers, determine the top K broker’s performance with the frequently repeated IDs in the given data set.

Strategy time!#

Match the problems that can be solved using the top K elements pattern.

Note: Select a problem in the left-hand column by clicking it, and then click one of the two options in the right-hand column.

Match The Answer
Select an option from the left-hand side

Rearrange the given string so that no two identical characters are adjacent to each other.

Explanation

Top K Elements

Detect a cycle in a linked list.

Explanation

Some other pattern

Find the five nearest points to the origin.

Explanation

Implement a stack in which the pop operation removes the most frequent element.

Explanation

Solution: Find All Anagrams in a String

K Closest Points to Origin