703. Kth Largest Element in a Stream
Problem Description
In this problem, we are asked to design a KthLargest
class that can determine the kth largest element in a stream of numbers. Importantly, when we refer to the kth largest element, we mean according to sorted order, not that the element is distinct from the others. The numbers can be repeated in the stream.
The class should be able to handle two types of actions:
-
Initialization (
__init__
): When an instance of the class is created, it should be initialized with an integerk
and an array of integersnums
. Thek
represents the position of the largest element we are interested in, andnums
is the initial stream of numbers. -
Adding New Elements (
add
): The class should provide a method to add a new integerval
to the stream. After adding the new integer, this method should return the current kth largest element from the stream.
Intuition
The straightforward approach to find the kth largest element would be to sort the stream of numbers each time we insert a new element and then pick the kth largest. However, sorting every time we insert a new element leads to a less efficient solution. To solve the problem more optimally, we can use a data structure that does the hard work for us - a Min Heap.
The reason behind choosing a Min Heap is its useful property: the smallest element is always at the root and can be accessed in constant time. If we maintain a Min Heap of exactly k
largest elements from the current stream, the root of the heap (the smallest element in the heap) is our desired kth largest element.
Here's how we construct our solution using this intuition:
- Initialize a Min Heap (
self.q
) and a variableself.size
to storek
. - Insert the initial stream of numbers to the Min Heap using the
add
method.- During each insert, we push the new value into the heap.
- If the heap size exceeds
k
, it means we have more thank
elements, so we can remove the smallest one (the root). - The kth largest element will then be the new root of the heap.
- The
add
method maintains the Min Heap invariant and ensures it always contains thek
largest elements after each insertion. - Finally, we can return the kth largest element by looking at the root of the Min Heap.
Using this approach, we avoid the need to sort the entire stream every time, thereby improving the efficiency of finding the kth largest element in the stream.
Learn more about Tree, Binary Search Tree, Binary Tree, Data Stream and Heap (Priority Queue) patterns.
Solution Approach
To implement the solution, we employ Python's heapq
module, which provides an implementation of the Min Heap data structure.
Here's a step-by-step breakdown of the implementation:
-
When an instance of
KthLargest
class is created, the__init__
method is called with two parameters:k
andnums
.k
is stored inself.size
which represents the position of the kth largest element we are interested in.self.q
is initialized as an empty list, which will be used as the Min Heap.
-
The
nums
list is iterated over and each number is added to the Min Heap using theadd
method. -
The
add
method handles inserting a new element into the Min Heap and ensures the Min Heap contains only thek
largest elements.heappush(self.q, val)
is used to add the new valueval
to the Min Heap.- We check if the size of the heap exceeds
k
by comparinglen(self.q)
toself.size
. - If the heap size is greater than
k
, we remove the smallest element (root of the Min Heap) usingheappop(self.q)
to ensure that only thek
largest elements remain. - This keeps the heap size at max
k
elements at all times. - The root element, which can be accessed with
self.q[0]
, is the kth largest element because all elements bigger than it are also in the heap.
By following this approach, we ensure that:
- The add operation has a time complexity of O(log k) since it involves heap operations which work in logarithmic time relative to the number of elements in the heap.
- The space complexity of the solution is O(k) because the Min Heap holds at most
k
elements at any given time.
This provides a balanced and efficient way to continuously find the kth largest element in the stream without having to sort the entire array each time a new element is added.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let us consider an example to illustrate the solution approach.
Suppose k
is 3, and the initial array of numbers nums
is [4, 5, 8, 2]
. We want to design a KthLargest
class where we can add elements and always be able to find the 3rd largest element in the current stream.
Here is a step-by-step walkthrough of how the KthLargest
class processes the stream of numbers:
-
We create an instance of
KthLargest
by passing in our valuesk=3
andnums=[4,5,8,2]
. The__init__
method initializesself.size
to 3 andself.q
as an empty Min Heap. -
We add each element of
nums
into the Min Heap using theadd
method, ensuring that after every addition, the Min Heap holds at mostk
largest elements. Initially, the Min Heap (self.q) is empty, and we proceed as follows:- Add 4: self.q becomes [4].
- Add 5: self.q becomes [4, 5].
- Add 8: self.q becomes [4, 5, 8].
- Add 2: self.q remains [4, 5, 8] because 2 is not among the 3 largest elements.
Now the Min Heap contains k
largest elements, the kth largest (3rd largest) among them is 4 (the root of the Min Heap).
-
If we call the
add
method to insert a new element, let's say3
, we again ensure that the Min Heap only contains thek
largest elements.- Add 3: self.q remains [4, 5, 8] because 3 is not among the 3 largest elements. The 3rd largest element is still 4.
-
Let's insert a larger element,
9
. The Min Heap needs to accommodate this change, and the smallest element in the heap should be removed if the heap's size exceedsk
.- Add 9: self.q becomes [5, 8, 9] after removing 4 because the size exceeded
k
. Now the 3rd largest element is 5.
- Add 9: self.q becomes [5, 8, 9] after removing 4 because the size exceeded
So after adding the element 9, our Min Heap has elements [5, 8, 9], and the 3rd largest (kth largest) element is 5.
The class now effectively and efficiently maintains the 3rd largest element as we add new values into the stream. The add
method updates the Min Heap in logarithmic time, making the operation swift compared to the linear time complexity of sorting the numbers upon each insertion.
Solution Implementation
1from heapq import heappush, heappop
2
3class KthLargest:
4 def __init__(self, k: int, nums: List[int]):
5 # Initialize a min heap to store the k largest elements
6 self.min_heap = []
7 # Store the size k to know the kth largest value
8 self.k = k
9 # Add the initial elements to the heap
10 for num in nums:
11 self.add(num)
12
13 def add(self, val: int) -> int:
14 # Add the new value to the min heap
15 heappush(self.min_heap, val)
16 # If the size of the heap exceeds k, remove the smallest element
17 if len(self.min_heap) > self.k:
18 heappop(self.min_heap)
19 # The root of the min heap is the kth largest value
20 return self.min_heap[0]
21
22# Example usage:
23# kthLargest = KthLargest(3, [4, 5, 8, 2])
24# print(kthLargest.add(3)) # returns 4
25# print(kthLargest.add(5)) # returns 5
26# print(kthLargest.add(10)) # returns 5
27# print(kthLargest.add(9)) # returns 8
28# print(kthLargest.add(4)) # returns 8
29
1import java.util.PriorityQueue;
2
3// Class to find the kth largest element in a stream of numbers
4class KthLargest {
5
6 // Priority queue to maintain the smallest 'k' elements seen so far
7 private PriorityQueue<Integer> minHeap;
8
9 // The kth position we are interested in for the largest element
10 private int k;
11
12 /**
13 * Constructor to initialize the data structure and populate with initial elements.
14 *
15 * @param k The kth position to track in the list of largest elements.
16 * @param nums An array of initial numbers to be added to the kth largest tracker.
17 */
18 public KthLargest(int k, int[] nums) {
19 // Initialize a min-heap with the capacity to hold 'k' elements
20 this.minHeap = new PriorityQueue<>(k);
21 this.k = k;
22
23 // Add the initial elements to the kth largest tracker
24 for (int num : nums) {
25 add(num);
26 }
27 }
28
29 /**
30 * Adds a new number into the stream and returns the kth largest element.
31 *
32 * @param val The new number to be added to the stream.
33 * @return The kth largest element after adding the new number.
34 */
35 public int add(int val) {
36 // Always add the new value to the min-heap
37 minHeap.offer(val);
38
39 // If the size of the min-heap exceeds 'k', remove the smallest element
40 if (minHeap.size() > k) {
41 minHeap.poll();
42 }
43
44 // The root of the min-heap represents the kth largest element
45 return minHeap.peek();
46 }
47}
48
49/* Usage example:
50 * KthLargest kthLargest = new KthLargest(3, new int[]{4, 5, 8, 2});
51 * kthLargest.add(3); // returns 4
52 * kthLargest.add(5); // returns 5
53 * kthLargest.add(10); // returns 5
54 * kthLargest.add(9); // returns 8
55 * kthLargest.add(4); // returns 8
56 */
57
1#include <queue>
2#include <vector>
3
4// Define a class to find kth largest element in a stream of numbers.
5class KthLargest {
6public:
7 // Declare a min heap to keep track of the k largest elements.
8 std::priority_queue<int, std::vector<int>, std::greater<>> minHeap;
9 // Store the value of 'k' to know which largest element to keep track of.
10 int kthSize;
11
12 // Constructor for initializing the KthLargest class.
13 // k - The kth position to track.
14 // nums - Initial list of numbers from which we find the kth largest element.
15 KthLargest(int k, std::vector<int>& nums) {
16 kthSize = k;
17 // Add initial numbers to the heap
18 for (int num : nums) {
19 add(num);
20 }
21 }
22
23 // Function to add a number to the stream and return the kth largest element.
24 int add(int val) {
25 // Add the new number to the min heap.
26 minHeap.push(val);
27 // If the heap size is greater than k, remove the smallest element.
28 if (minHeap.size() > kthSize) {
29 minHeap.pop();
30 }
31 // Return the kth largest element (top of the min heap).
32 return minHeap.top();
33 }
34};
35
36/**
37 * The provided code snippet illustrates the use of the KthLargest class.
38 * It demonstrates the instantiation of a KthLargest object and subsequent calls to its add method.
39 */
40
1type ComparatorFunction = (a: number, b: number) => number;
2
3let kthLargestSize: number;
4let minHeap: MinHeap;
5
6interface MinHeap {
7 data: number[];
8 comparator: ComparatorFunction;
9 heapify: () => void;
10 peek: () => number | null;
11 offer: (value: number) => void;
12 poll: () => number | null;
13 bubbleUp: (index: number) => void;
14 bubbleDown: (index: number) => void;
15 swap: (index1: number, index2: number) => void;
16 size: () => number;
17}
18
19// Initializing the Kth largest element finder
20function KthLargestInit(k: number, nums: number[]): void {
21 kthLargestSize = k;
22 minHeap = createMinHeap();
23 nums.forEach(num => {
24 kthLargestAdd(num);
25 });
26}
27
28// Function to add a new value to the data collection
29function kthLargestAdd(val: number): number {
30 minHeap.offer(val);
31 if (minHeap.size() > kthLargestSize) {
32 minHeap.poll();
33 }
34 return minHeap.peek()!;
35}
36
37// Create a MinHeap with default values and a comparator
38function createMinHeap(): MinHeap {
39 const heap: MinHeap = {
40 data: [],
41 comparator: (a, b) => a - b,
42 heapify() {
43 if (this.size() < 2) return;
44 for (let i = 1; i < this.size(); i++) {
45 this.bubbleUp(i);
46 }
47 },
48 peek() {
49 return this.size() === 0 ? null : this.data[0];
50 },
51 // Adds a new value and bubbles it up to maintain heap property
52 offer(value) {
53 this.data.push(value);
54 this.bubbleUp(this.size() - 1);
55 },
56 // Removes and returns the root value of the heap
57 poll() {
58 if (this.size() === 0) {
59 return null;
60 }
61 const result = this.data[0];
62 const last = this.data.pop();
63 if (this.size() !== 0) {
64 this.data[0] = last!;
65 this.bubbleDown(0);
66 }
67 return result;
68 },
69 // Bubbles a value up the heap until the heap property is restored
70 bubbleUp(index) {
71 while (index > 0) {
72 const parentIndex = (index - 1) >> 1;
73 if (this.comparator(this.data[index], this.data[parentIndex]) < 0) {
74 this.swap(index, parentIndex);
75 index = parentIndex;
76 } else {
77 break;
78 }
79 }
80 },
81 // Bubbles a value down the heap until the heap property is restored
82 bubbleDown(index) {
83 const lastIndex = this.size() - 1;
84 while (true) {
85 let findIndex = index;
86 const leftIndex = index * 2 + 1;
87 const rightIndex = index * 2 + 2;
88 if (
89 leftIndex <= lastIndex &&
90 this.comparator(this.data[leftIndex], this.data[findIndex]) < 0
91 ) {
92 findIndex = leftIndex;
93 }
94 if (
95 rightIndex <= lastIndex &&
96 this.comparator(this.data[rightIndex], this.data[findIndex]) < 0
97 ) {
98 findIndex = rightIndex;
99 }
100 if (index !== findIndex) {
101 this.swap(index, findIndex);
102 index = findIndex;
103 } else {
104 break;
105 }
106 }
107 },
108 // Swaps two values in the heap
109 swap(index1, index2) {
110 [this.data[index1], this.data[index2]] = [this.data[index2], this.data[index1]];
111 },
112 // Returns the number of elements in the heap
113 size() {
114 return this.data.length;
115 }
116 };
117
118 heap.heapify();
119 return heap;
120}
121
122// Example of usage
123KthLargestInit(3, [4, 5, 8, 2]);
124console.log(kthLargestAdd(3)); // Should return the kth largest element after adding 3
125console.log(kthLargestAdd(5)); // Should return the kth largest element after adding 5
126console.log(kthLargestAdd(10)); // Should return the kth largest element after adding 10
127console.log(kthLargestAdd(9)); // Should return the kth largest element after adding 9
128console.log(kthLargestAdd(4)); // Should return the kth largest element after adding 4
129
Time and Space Complexity
Time Complexity:
-
The
__init__(self, k: int, nums: List[int])
method has a time complexity ofO(n * log(k))
wheren
is the length of thenums
list. This is because for every element in nums, theadd
operation is called which takesO(log(k))
time due to the heap operation and we perform thisn
times. -
The
add(self, val: int) -> int
method has a time complexity ofO(log(k))
. This is because theheappush
operation could take up toO(log(k))
time to maintain the heap properties when adding a new element, and similarlyheappop
operation, which is called only when the heap size exceedsk
, takesO(log(k))
.
Space Complexity:
- The space complexity of the whole class is
O(k)
since the heap that is maintained by the class never contains more thank
elements at any time.
Learn more about how to find time and space complexity quickly using problem constraints.
Which data structure is used to implement priority queue?
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
Binary Search Tree Intro Definition A Binary Search Tree or BST is a rooted binary tree with the value of each internal node being greater than all the values in the respective node's left subtree and less than the ones in its right subtree An empty tree is a BST since it satisfies the
Binary Tree Min Depth Prereq BFS on Tree problems bfs_intro Given a binary tree find the depth of the shallowest leaf node https algomonster s3 us east 2 amazonaws com binary_tree_min_depth png Explanation We can solve this problem with either DFS or BFS With DFS we traverse the whole tree looking for leaf nodes and record and update the minimum depth as we go With BFS though since we search level by level we are guaranteed to find the shallowest leaf node
Want a Structured Path to Master System Design Too? Don’t Miss This!