Algorithms to Find Kth Largest/Smallest Element in an Array/List


Find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.
For example,
Given [3,2,1,5,6,4] and k = 2, return 5.

Find the Kth element by Sorting using STL

Sort the array and return the kth number from the end. Sorting generally takes O(nlogn).

1
2
3
4
5
6
7
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        sort(nums.begin(), nums.end());
        return nums[nums.size() - k];
    }
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        sort(nums.begin(), nums.end());
        return nums[nums.size() - k];
    }
};

We can sort the array in reverse order:

1
2
3
4
5
6
7
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        sort(nums.rbegin(), nums.rend());
        return nums[k - 1];
    }
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        sort(nums.rbegin(), nums.rend());
        return nums[k - 1];
    }
};

Algorithm to Find Kth Smallest/Largest Element in the Array by Using the Heap

A Heap is a data structure that is also a tree. The heap satifies that any parent nodes are greater/smaller than its children nodes depending on whether it is a max heap or min heap. So if we make these numbers into a heap and remove its maximum (root value) K-1 times, we have its K-th largest number.

1
2
3
4
5
6
7
8
9
10
11
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        make_heap(nums.begin(), nums.end());
        for (int i = 0; i < k - 1; i ++) {
            pop_heap(nums.begin(), nums.end());
            nums.pop_back();
        }
        return nums.front();
    }
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        make_heap(nums.begin(), nums.end());
        for (int i = 0; i < k - 1; i ++) {
            pop_heap(nums.begin(), nums.end());
            nums.pop_back();
        }
        return nums.front();
    }
};

The pop_heap removes the front element (root) of the heap and move it to the last element, so the size of the heap array is shrinked by one.

Using Priority Queue to find the Kth largest/smallest element

Similarly, we can use a priority queue data structure and pop K-1 elements from the priority queue, then the next element to pop is the kth element we want. It takes O(N) to build a priority queue from a given list. Then removing K-1 elements require O(KLogN) time. Overall complexity is O(N + KLogN).

1
2
3
4
5
6
7
8
9
10
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        priority_queue<int> pq(begin(nums), end(nums));
        for (int i = 0; i < k - 1; ++ i) {
            pq.pop();
        }
        return pq.top();
    }
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        priority_queue<int> pq(begin(nums), end(nums));
        for (int i = 0; i < k - 1; ++ i) {
            pq.pop();
        }
        return pq.top();
    }
};

In general, if you build a priority queue by inserting N times which costs O(logN) insert, it requires O(NLogN). If you require the Kth smallest element (popping the smallest element from the priority queue first), we can use the following syntax with a custom comparator:

1
priority_queue<int, vector<int>, std::greater<int>> pq;
priority_queue<int, vector<int>, std::greater<int>> pq;

Binary Search Algorithm to Find the Kth Element in the List

If we count that there are K-1 numbers larger than x in the array, then x is the answer. At the begining, the min and max ranges can be set, with binary search, if there are more than K values larger than mid, then the value should be within the range [mid+1, max] otherwise [min, mid-1].

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
class Solution {
public:
    int countLargerOrEqualThanK(vector<int>& nums, int k) {
        auto c = 0;
        for (auto x: nums) {
            if (x >= k) {
                c ++;
            }
        }
        return c;
    }
 
    int findKthLargest(vector<int>& nums, int k) {
        int max = *max_element(nums.begin(), nums.end());
        int min = *min_element(nums.begin(), nums.end());
        while (min <= max) {
            int mid = min + (max - min) / 2;
            auto c = countLargerOrEqualThanK(nums, mid);
            if (c >= k) {
                min = mid + 1;
            } else {
                max = mid - 1;
            }
        }
        return max;
    }
};
class Solution {
public:
    int countLargerOrEqualThanK(vector<int>& nums, int k) {
        auto c = 0;
        for (auto x: nums) {
            if (x >= k) {
                c ++;
            }
        }
        return c;
    }

    int findKthLargest(vector<int>& nums, int k) {
        int max = *max_element(nums.begin(), nums.end());
        int min = *min_element(nums.begin(), nums.end());
        while (min <= max) {
            int mid = min + (max - min) / 2;
            auto c = countLargerOrEqualThanK(nums, mid);
            if (c >= k) {
                min = mid + 1;
            } else {
                max = mid - 1;
            }
        }
        return max;
    }
};

Binary search algorithm takes O(logn) complexity but counting the numbers takes O(n) and that is why the over complexity is O(n*logn).

We can use the minmax_element to obtain the smallest/largest element in the list at one pass (left and right boundary).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        auto minmax = minmax_element(begin(nums), end(nums));
        int left = *minmax.first;
        int right = *minmax.second;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            auto c = countLargerOrEqualThanK(nums, mid);
            if (c >= k) {
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return right;
    }
    
private:    
    int countLargerOrEqualThanK(vector<int>& nums, int k) {
        auto c = 0;
        for (auto x: nums) {
            if (x >= k) {
                c ++;
            }
        }
        return c;
    }    
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        auto minmax = minmax_element(begin(nums), end(nums));
        int left = *minmax.first;
        int right = *minmax.second;
        while (left <= right) {
            int mid = left + (right - left) / 2;
            auto c = countLargerOrEqualThanK(nums, mid);
            if (c >= k) {
                left = mid + 1;
            } else {
                right = mid - 1;
            }
        }
        return right;
    }
    
private:    
    int countLargerOrEqualThanK(vector<int>& nums, int k) {
        auto c = 0;
        for (auto x: nums) {
            if (x >= k) {
                c ++;
            }
        }
        return c;
    }    
};

Find Kth Element by using nth_element to Partition

The C++ nth_element makes sure the nth element is in its correct position. And it is O(n), so with the help of STL:nth_element, the overall time complexity to Find the Kth element will be O(n).

1
2
3
4
5
6
7
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        nth_element(nums.begin(), nums.end() - k, nums.end());
        return nums[nums.size() - k];
    }
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        nth_element(nums.begin(), nums.end() - k, nums.end());
        return nums[nums.size() - k];
    }
};

Partition the Array to Find the Kth Largest/Smallest Element

The following is essentially similar to the nth_element algorithm. Each iteration, we pick a pivot element and then we partition the array into two halves, the one that has all elements smaller than it and the others that are larger than it. The good thing is that we know where the pivot index is. Thus, we can iteratively partition the half (and not the other half which the Kth is not in it).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        int left = 0;
        int right = nums.size() - 1;
        while (left <= right) {
            int pivotIndex = partition(nums, left, right);
            if (pivotIndex == nums.size() - k) {
                return nums[pivotIndex];
            }
            if (pivotIndex > nums.size() - k) {
                right = pivotIndex - 1;
            } else {
                left = pivotIndex + 1;
            }
        }
        return -1;
    }
    
private:
    int partition(vector<int> &nums, int low, int high) {
        int idx = low;
        for (int i = low; i < high; ++ i) {
            if (nums[i] <= nums[high]) {
                swap(nums[idx ++], nums[i]);
            }
        }
        swap(nums[idx], nums[high]);
        return idx;
    }
};
class Solution {
public:
    int findKthLargest(vector<int>& nums, int k) {
        int left = 0;
        int right = nums.size() - 1;
        while (left <= right) {
            int pivotIndex = partition(nums, left, right);
            if (pivotIndex == nums.size() - k) {
                return nums[pivotIndex];
            }
            if (pivotIndex > nums.size() - k) {
                right = pivotIndex - 1;
            } else {
                left = pivotIndex + 1;
            }
        }
        return -1;
    }
    
private:
    int partition(vector<int> &nums, int low, int high) {
        int idx = low;
        for (int i = low; i < high; ++ i) {
            if (nums[i] <= nums[high]) {
                swap(nums[idx ++], nums[i]);
            }
        }
        swap(nums[idx], nums[high]);
        return idx;
    }
};

The partition algorithm works like this: we swap the element at pivot idx if the current element is smaller than the pivot element. And at last, we swap the element at pivot index and the pivot element. This overall complexity is O(N).

Quickselect has an average complexity in O(n) but a worst case in O(n log(n)). It seems that there’s a theoretically better algorithm but unusable in practice (median of median). Introselect tends to act better in the worst case scenario for quickselect and is often use in conjunction.

–EOF (The Ultimate Computing & Technology Blog) —

GD Star Rating
loading...
1216 words
Last Post: The USB Flash Benchmark Tool - FlashBench
Next Post: Simple Javascript Unit Testing

The Permanent URL is: Algorithms to Find Kth Largest/Smallest Element in an Array/List

3 Comments

  1. Anon

Leave a Reply