4. Median of Two Sorted Arrays
Problem Description
The problem presents us with two sorted arrays, nums1
and nums2
, with sizes m
and n
respectively. Our task is to find the median of these two sorted arrays combined. The median is the value that separates a dataset into two halves. In other words, half of the numbers in the dataset are less than the median, and half are greater than it. If there is an odd number of observations, the median is the middle number. If there is an even number of observations, the median is the average of the two middle numbers.
Given the arrays are already sorted, if we could merge them, finding the median would be trivial: take the middle value or the average of the two middle values. However, merging the arrays may not be efficient, specifically for large sizes of m
and n
. Therefore, we are asked to find an efficient approach with a runtime complexity of O(log (m+n)), which suggests that we should not actually merge the arrays, and instead, consider an algorithm similar in efficiency to binary search.
Intuition
The intuition behind the solution is to use a binary search due to the sorted nature of the arrays and the requirement of a O(log (m+n)) runtime complexity. To find the median of two sorted arrays, we could approach the problem as finding the 'kth' smallest element, where 'k' would be the median position in the combined array, which is (m+n+1)//2
for the lower median and (m+n+2)//2
for the upper median.
The solution uses a binary search which is applied recursively by defining a helper function f(i, j, k)
where i
and j
are the current indices in nums1
and nums2
respectively, and k
is the position of the element we are looking for in the merged array. If one of the arrays is exhausted (i >= m
or j >= n
), the kth element is directly found in the other array. When the position k
is 1, we simply take the minimum between the current elements pointed by i
and j
.
The main trick is dividing the problem in half at each step: We pick the middle element position (p = k // 2
) from the remaining elements to compare in each array and reduce the problem size by p
, discarding the smaller half each time. The variable x
corresponds to the element at position i + p - 1
in nums1
unless this position is outside the array bounds (in which case we use infinity), and similarly for y
in nums2
. We recursively call function f
with adjusted indices based on which half we are discarding.
Finally, since the median can be a single middle value (for odd total length) or the average of two middle values (for even total length), the solution calls the helper function twice, for the kth and (k+1)th positions, and returns their average.
Note: The use of inf
in the solution assumes a Python environment where inf
represents an infinite number, which serves as a stand-in when comparing out-of-bounds indices in one of the arrays.
Learn more about Binary Search and Divide and Conquer patterns.
Solution Approach
The solution employs a divide and conquer strategy which is effectively a tailored binary search across two arrays to find the median.
Here's a step by step breakdown of the approach:
-
Define a Helper Function: The
f(i, j, k)
function serves as the core for the binary search strategy, by recursively finding thekth
smallest element in the combined array made up bynums1
andnums2
. Parametersi
andj
are the starting points innums1
andnums2
, respectively, andk
is the number of steps away from the starting points to the target element in the merged array. -
Base Cases: The helper function has three essential base cases:
- If
i
orj
run past the length of their respective arrays (i >= m
orj >= n
), the answer is found in the other array. This is because once one array is exhausted, the remaining kth element must be in the other array. - If
k
is 1, we can't divide the search space any smaller, so we just return the minimum of the two element values thati
andj
are pointing at.
- If
-
Recursive Case: For the recursive case, we divide the remaining search space (
k
) in two by settingp = k // 2
. We then find the potentialpth
element in both arrays (x
andy
) if those positions are in bounds. -
Compare and Eliminate: We compare these potential candidates,
x
andy
, to eliminate the half of the search space that can't possibly contain the kth smallest element. Ifx
is smaller, we know that anything before and includingx
innums1
cannot contain the kth smallest, so we move the search space pastx
, by callingf(i + p, j, k - p)
. Similarly, ify
is smaller, we callf(i, j + p, k - p)
. -
Find Medians: Once we've established our helper function, we use it to find the medians depending on whether the combined length of
nums1
andnums2
is odd or even:- For an odd combined length, the median is at position
(m + n + 1) // 2
, so only one call tof(0, 0, (m + n + 1) // 2)
is necessary. - For an even combined length, we call the function twice with
(m + n + 1) // 2
and(m + n + 2) // 2
to derive two middle values and take their average.
- For an odd combined length, the median is at position
The use of "inf" ensures that the comparison logic remains correct even when all elements of one array have been accounted for, thus allowing the binary search to proceed correctly with the remaining array.
Data Structures:
- No additional data structures are used, aside from the input arrays and variables to store intermediate values.
Algorithm Patterns:
- Binary Search: The solution utilizes the binary search pattern in a recursive manner to find the kth smallest element in a merged sorted array without actually merging them.
- Divide and Conquer: The problem space is halved in each recursive call, efficiently narrowing down to the median value.
- Recursion: Through the recursive function
f(i, j, k)
, we are capable of diving into smaller subproblems and compositions of the problem.
Overall, the binary search pattern is applied in a unique way that leverages the sorted property of the arrays and the fact that the kth smallest element search can be performed in a log-linear time complexity given these properties.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's go through a small example to illustrate the solution approach using two sorted arrays nums1
and nums2
.
Consider the arrays nums1 = [1, 3]
and nums2 = [2]
. The combined array after merging would be [1, 2, 3]
, where the median is 2
because it is the middle element.
We will not actually merge the arrays. Instead, we will employ the divide and conquer binary search as described.
Step 1: Find the total length and calculate the median position.
The total length of the combined arrays is m + n = 2 + 1 = 3
, which is odd. The median position k
is (m + n + 1) // 2 = (3 + 1) // 2 = 2
.
Step 2: Begin the binary search with the helper function f(i, j, k)
.
We want to find the 2nd smallest element without merging, so we start by calling f(0, 0, 2)
. Here i
and j
are starting indices for nums1
and nums2
, and k
is 2.
Step 3: Handle the base and recursive cases.
Since k
is not 1 and we haven't exceeded array lengths, we find the middle index p
for the current k
, which is p = k // 2 = 1
.
Now, we look at the elements at index p - 1
(since arrays are zero-indexed) in each array. In nums1
, it's nums1[0]
, which is 1
, and in nums2
, it's nums2[0]
, which is 2
. (We will call these x
and y
respectively).
Step 4: Eliminate one half of the search space.
Since x < y
(1 < 2), we eliminate x
and elements to the left of x
. Now we call f(i + p, j, k - p)
, which translates to f(1, 0, 1)
because we're moving past the element 1
in nums1
.
Step 5: Reaching the base case.
Now k
is 1
, which means we just need to find the minimum of the current elements nums1[i]
and nums2[j]
. The array nums1
at index 1
has no more elements, and the array nums2
at index 0
has the element 2
.
Since the element in nums1
is out of bounds, we consider it infinity. Therefore, nums2[j]
which is 2
, is the smaller element.
Step 6: Return the result.
Since we were looking for the 2nd smallest element, 2
is the median, and we return this as the result.
Through this example, we successfully applied the binary search divide and conquer strategy to find the median of two sorted arrays without merging them, fulfilling the required O(log (m+n)) runtime complexity.
Solution Implementation
1from typing import List
2
3class Solution:
4 def findMedianSortedArrays(self, nums1: List[int], nums2: List[int]) -> float:
5 # Helper function to find the k-th smallest element in two sorted arrays
6 def findKthElement(index1: int, index2: int, k: int) -> int:
7 # If nums1 is exhausted, return k-th element from nums2
8 if index1 >= length1:
9 return nums2[index2 + k - 1]
10 # If nums2 is exhausted, return k-th element from nums1
11 if index2 >= length2:
12 return nums1[index1 + k - 1]
13 # If k is 1, return the smaller of the two starting elements
14 if k == 1:
15 return min(nums1[index1], nums2[index2])
16
17 # Compare elements at half of k in each array and call recursively
18 half_k = k // 2
19 midVal1 = nums1[index1 + half_k - 1] if index1 + half_k - 1 < length1 else float('inf')
20 midVal2 = nums2[index2 + half_k - 1] if index2 + half_k - 1 < length2 else float('inf')
21
22 # Recurse into the array with the smaller value or adjust indexes accordingly
23 if midVal1 < midVal2:
24 return findKthElement(index1 + half_k, index2, k - half_k)
25 else:
26 return findKthElement(index1, index2 + half_k, k - half_k)
27
28 # Get lengths of both arrays
29 length1, length2 = len(nums1), len(nums2)
30
31 # Find the median
32 # If the combined length is odd, the median is the middle element; if even, the average of the two middle elements
33 median_left = findKthElement(0, 0, (length1 + length2 + 1) // 2)
34 median_right = findKthElement(0, 0, (length1 + length2 + 2) // 2)
35
36 # Calculate the median and return it
37 return (median_left + median_right) / 2
38
1class Solution {
2 // Class variables to store the size of arrays and the arrays themselves.
3 private int sizeNums1;
4 private int sizeNums2;
5 private int[] nums1;
6 private int[] nums2;
7
8 // Main function to find the median of two sorted arrays.
9 public double findMedianSortedArrays(int[] nums1, int[] nums2) {
10 // Initialize class variables with array sizes and the arrays themselves.
11 sizeNums1 = nums1.length;
12 sizeNums2 = nums2.length;
13 this.nums1 = nums1;
14 this.nums2 = nums2;
15
16 // Median can be the average of the middle two values for even length combined arrays
17 // or the middle one for odd length combined arrays.
18 int leftMedian = findKthElement(0, 0, (sizeNums1 + sizeNums2 + 1) / 2);
19 int rightMedian = findKthElement(0, 0, (sizeNums1 + sizeNums2 + 2) / 2);
20
21 // The median is the average of the two middle numbers for even-sized arrays.
22 return (leftMedian + rightMedian) / 2.0;
23 }
24
25 // Helper function to find the k-th element.
26 private int findKthElement(int startNums1, int startNums2, int k) {
27 // Base cases for recursion.
28 if (startNums1 >= sizeNums1) {
29 return nums2[startNums2 + k - 1]; // Select k-th element from nums2
30 }
31 if (startNums2 >= sizeNums2) {
32 return nums1[startNums1 + k - 1]; // Select k-th element from nums1
33 }
34 if (k == 1) {
35 // If k is 1, then return the minimum of current starting elements.
36 return Math.min(nums1[startNums1], nums2[startNums2]);
37 }
38
39 // Calculate the mid point to compare elements.
40 int midIndex = k / 2;
41 // Assign INT_MAX if the mid point is beyond the array bounds.
42 int midValNums1 = startNums1 + midIndex - 1 < sizeNums1 ? nums1[startNums1 + midIndex - 1] : Integer.MAX_VALUE;
43 int midValNums2 = startNums2 + midIndex - 1 < sizeNums2 ? nums2[startNums2 + midIndex - 1] : Integer.MAX_VALUE;
44
45 // Discard half of the elements from the array which has smaller midValue.
46 // Because those can never contain the k-th element.
47 if (midValNums1 < midValNums2) {
48 return findKthElement(startNums1 + midIndex, startNums2, k - midIndex);
49 } else {
50 return findKthElement(startNums1, startNums2 + midIndex, k - midIndex);
51 }
52 }
53}
54
1class Solution {
2public:
3 double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
4 // Calculate the lengths of both input arrays
5 int length1 = nums1.size();
6 int length2 = nums2.size();
7
8 // Define a lambda function to find the k-th element in the merged array
9 auto findKthElement = [&](int start1, int start2, int k) -> int {
10 // If start1 is beyond the end of nums1, k-th element is in nums2
11 if (start1 >= length1) {
12 return nums2[start2 + k - 1];
13 }
14 // If start2 is beyond the end of nums2, k-th element is in nums1
15 if (start2 >= length2) {
16 return nums1[start1 + k - 1];
17 }
18 // If k is 1, return the minimum of the first unchecked element in both arrays
19 if (k == 1) {
20 return std::min(nums1[start1], nums2[start2]);
21 }
22 // Find the midpoint of the remaining k elements to compare
23 int half_k = k / 2;
24
25 // Check the boundary conditions and select the value to compare
26 int value1 = start1 + half_k - 1 < length1 ? nums1[start1 + half_k - 1] : INT_MAX;
27 int value2 = start2 + half_k - 1 < length2 ? nums2[start2 + half_k - 1] : INT_MAX;
28
29 // If value1 is less than value2, the k-th element is not in the first half_k elements of nums1
30 // Otherwise, it is not in the first half_k elements of nums2
31 if (value1 < value2) {
32 return findKthElement(start1 + half_k, start2, k - half_k);
33 } else {
34 return findKthElement(start1, start2 + half_k, k - half_k);
35 }
36 };
37
38 // Find the median by averaging the middle two elements
39 // If the total length is odd, both a and b will be the same element (the true median)
40 int a = findKthElement(0, 0, (length1 + length2 + 1) / 2);
41 int b = findKthElement(0, 0, (length1 + length2 + 2) / 2);
42
43 // Calculate the final median and return it
44 return (a + b) / 2.0;
45 }
46};
47
1function findMedianSortedArrays(nums1: number[], nums2: number[]): number {
2 // Calculate the length of both input arrays
3 const lengthNums1 = nums1.length;
4 const lengthNums2 = nums2.length;
5
6 // Helper function to find the kth smallest element in the two sorted arrays
7 // starting from indices `indexNums1` and `indexNums2`.
8 const findKthElement = (indexNums1: number, indexNums2: number, k: number): number => {
9 // If the starting index for nums1 is out of range, return kth element from nums2
10 if (indexNums1 >= lengthNums1) {
11 return nums2[indexNums2 + k - 1];
12 }
13 // If the starting index for nums2 is out of range, return kth element from nums1
14 if (indexNums2 >= lengthNums2) {
15 return nums1[indexNums1 + k - 1];
16 }
17 // If k is 1, return the minimum of the first element of both arrays
18 if (k === 1) {
19 return Math.min(nums1[indexNums1], nums2[indexNums2]);
20 }
21
22 // Determine the k/2 element in each array or set to a very large number if it's out of range
23 const halfK = Math.floor(k / 2);
24 const nums1Key = indexNums1 + halfK - 1 < lengthNums1 ? nums1[indexNums1 + halfK - 1] : Number.MAX_VALUE;
25 const nums2Key = indexNums2 + halfK - 1 < lengthNums2 ? nums2[indexNums2 + halfK - 1] : Number.MAX_VALUE;
26
27 // Recur on the half of the arrays where the kth element is likely to be
28 return nums1Key < nums2Key
29 ? findKthElement(indexNums1 + halfK, indexNums2, k - halfK)
30 : findKthElement(indexNums1, indexNums2 + halfK, k - halfK);
31 };
32
33 // Calculate indices for median elements
34 const totalLength = lengthNums1 + lengthNums2;
35 const medianLeftIndex = Math.floor((totalLength + 1) / 2);
36 const medianRightIndex = Math.floor((totalLength + 2) / 2);
37
38 // Find the two median elements
39 const medianLeft = findKthElement(0, 0, medianLeftIndex);
40 const medianRight = findKthElement(0, 0, medianRightIndex);
41
42 // Calculate and return the median
43 // If the total length of the combined array is odd, medianLeft and medianRight will be the same
44 return (medianLeft + medianRight) / 2;
45}
46
Time and Space Complexity
The given Python code implements the solution to find the median of two sorted arrays using a divide and conquer approach. Let's analyze the time and space complexity of the code:
Time Complexity:
The time complexity of the function can be understood by considering the primary operation it performs - finding the k-th smallest element in two sorted arrays nums1
and nums2
. Each recursive call to the function f(i, j, k)
reduces the k
by approximately half (k - p
, where p = k // 2
), and at the same time, it incrementally discards p
elements from one of the arrays by advancing the index i
or j
.
Therefore, the k
reduces to 1
after approximately log(k)
steps. Since the initial k
is derived from the total number of elements in both arrays ((m + n + 1) // 2
or (m + n + 2) // 2
), the overall time complexity can be estimated as O(log(m + n))
.
Space Complexity:
There is no additional space allocated for storing elements. The space complexity is determined by the depth of the recursion stack. Since we have approximately log(m + n)
recursive calls due to the divide and conquer algorithm, and considering each recursive call uses a constant amount of space, the space complexity is O(log(m + n))
.
Learn more about how to find time and space complexity quickly using problem constraints.
What's the output of running the following function using input [30, 20, 10, 100, 33, 12]
?
1def fun(arr: List[int]) -> List[int]:
2 import heapq
3 heapq.heapify(arr)
4 res = []
5 for i in range(3):
6 res.append(heapq.heappop(arr))
7 return res
8
1public static int[] fun(int[] arr) {
2 int[] res = new int[3];
3 PriorityQueue<Integer> heap = new PriorityQueue<>();
4 for (int i = 0; i < arr.length; i++) {
5 heap.add(arr[i]);
6 }
7 for (int i = 0; i < 3; i++) {
8 res[i] = heap.poll();
9 }
10 return res;
11}
12
1class HeapItem {
2 constructor(item, priority = item) {
3 this.item = item;
4 this.priority = priority;
5 }
6}
7
8class MinHeap {
9 constructor() {
10 this.heap = [];
11 }
12
13 push(node) {
14 // insert the new node at the end of the heap array
15 this.heap.push(node);
16 // find the correct position for the new node
17 this.bubble_up();
18 }
19
20 bubble_up() {
21 let index = this.heap.length - 1;
22
23 while (index > 0) {
24 const element = this.heap[index];
25 const parentIndex = Math.floor((index - 1) / 2);
26 const parent = this.heap[parentIndex];
27
28 if (parent.priority <= element.priority) break;
29 // if the parent is bigger than the child then swap the parent and child
30 this.heap[index] = parent;
31 this.heap[parentIndex] = element;
32 index = parentIndex;
33 }
34 }
35
36 pop() {
37 const min = this.heap[0];
38 this.heap[0] = this.heap[this.size() - 1];
39 this.heap.pop();
40 this.bubble_down();
41 return min;
42 }
43
44 bubble_down() {
45 let index = 0;
46 let min = index;
47 const n = this.heap.length;
48
49 while (index < n) {
50 const left = 2 * index + 1;
51 const right = left + 1;
52
53 if (left < n && this.heap[left].priority < this.heap[min].priority) {
54 min = left;
55 }
56 if (right < n && this.heap[right].priority < this.heap[min].priority) {
57 min = right;
58 }
59 if (min === index) break;
60 [this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
61 index = min;
62 }
63 }
64
65 peek() {
66 return this.heap[0];
67 }
68
69 size() {
70 return this.heap.length;
71 }
72}
73
74function fun(arr) {
75 const heap = new MinHeap();
76 for (const x of arr) {
77 heap.push(new HeapItem(x));
78 }
79 const res = [];
80 for (let i = 0; i < 3; i++) {
81 res.push(heap.pop().item);
82 }
83 return res;
84}
85
Recommended Readings
Binary Search Speedrun For each of the Speedrun questions you will be given a binary search related problem and a corresponding multiple choice question The multiple choice questions are related to the techniques and template s introduced in the binary search section It's recommended that you have gone through at
https algomonster s3 us east 2 amazonaws com cover_photos dnd svg Divide and Conquer The main idea of divide and conquer is to split the main problem into two smaller components assuming that each one of the components had already been solved recursively and then try to solve the bigger
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Want a Structured Path to Master System Design Too? Don’t Miss This!