Problem Statement
Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.
The overall run time complexity should be O(log (m+n)).
Additional information
nums1.length == m
nums2.length == n
0 <= m <= 1000
0 <= n <= 1000
1 <= m + n <= 2000
-10^6 <= nums1[i], nums2[i] <= 10^6
Example 1:
Input: nums1 = [1, 3], nums2 = [2]
Output: 2.00000
Explanation: merged array = [1, 2, 3] and median is 2.
Example 2:
Input: nums1 = [1, 2], nums2 = [3, 4]
Output: 2.50000
Explanation: merged array = [1, 2, 3, 4] and median is (2 + 3) / 2 = 2.5.
Brute Force (Merge)
Intuition
Since both arrays are already sorted, we can merge them into a single large sorted array using a two-pointer approach (similar to Merge Sort). Once merged, we can directly access the median element. If the total length is odd, the median is the middle element. If even, it is the average of the two middle elements.
Algorithm
- Create a new array
merged.
- Use two pointers,
i and j, starting at 0 for nums1 and nums2.
- Compare
nums1[i] and nums2[j]. Append the smaller one to merged and increment the corresponding pointer.
- Once one array is exhausted, append the remaining elements of the other array.
- If
len(merged) is odd, return the middle element.
- If even, return the average of the two middle elements.
def find_median_sorted_arrays(nums1: list[int], nums2: list[int]) -> float:
merged = []
i, j = 0, 0
while i < len(nums1) and j < len(nums2):
if nums1[i] < nums2[j]:
merged.append(nums1[i])
i += 1
else:
merged.append(nums2[j])
j += 1
merged.extend(nums1[i:])
merged.extend(nums2[j:])
total = len(merged)
if total % 2 == 1:
return float(merged[total // 2])
else:
return (merged[total // 2 - 1] + merged[total // 2]) / 2.0
- Time complexity: $O(m + n)$
- Reason: We iterate through both arrays completely to merge them.
- Space complexity: $O(m + n)$
- Reason: We create a new array to store the merged result.
Binary Search (Optimal)
Intuition
We don't actually need to merge the arrays; we just need to find the "partition" line that divides the combined set of elements into two equal halves (Left and Right).
If we cut nums1 at index i and nums2 at index j such that the number of elements on the left side (i + j) equals half the total elements, and all elements on the left are smaller than all elements on the right, we have found the median.
We perform a binary search on the smaller array to find the correct cut i. The cut j in the second array is then automatically determined by the equation: j = (total + 1) // 2 - i.
Algorithm
- Ensure
nums1 is the smaller array (swap if necessary) to minimize binary search range.
- Set
total = m + n and half = (total + 1) // 2.
- Initialize
l = 0, r = len(nums1).
- While
l <= r:
i = (l + r) // 2 (Partition index for nums1).
j = half - i (Partition index for nums2).
Determine edge values: Aleft (nums1[i-1]), Aright (nums1[i]), Bleft (nums2[j-1]), Bright (nums2[j]). Handle out-of-bounds with -infinity and +infinity.
Valid Partition Check:
If Aleft <= Bright and Bleft <= Aright: Partition is correct.
If total is odd: max(Aleft, Bleft).
If total is even: (max(Aleft, Bleft) + min(Aright, Bright)) / 2.
Return result.
Adjustment:
If Aleft > Bright: We took too many from nums1. Move left (r = i - 1).
Else: We took too few. Move right (l = i + 1).
def find_median_sorted_arrays(nums1: list[int], nums2: list[int]) -> float:
A, B = nums1, nums2
if len(B) < len(A):
A, B = B, A
total = len(A) + len(B)
half = (total + 1) // 2
l, r = 0, len(A)
while l <= r:
i = (l + r) // 2
j = half - i
Aleft = A[i - 1] if i > 0 else float("-inf")
Aright = A[i] if i < len(A) else float("inf")
Bleft = B[j - 1] if j > 0 else float("-inf")
Bright = B[j] if j < len(B) else float("inf")
if Aleft <= Bright and Bleft <= Aright:
# Odd: median is the max of the left partition
if total % 2:
return float(max(Aleft, Bleft))
# Even: average of max(left) and min(right)
return (max(Aleft, Bleft) + min(Aright, Bright)) / 2
elif Aleft > Bright:
r = i - 1
else:
l = i + 1
- Time complexity: $O(\log(\min(m, n)))$
- Reason: We perform binary search only on the smaller array.
- Space complexity: $O(1)$
- Reason: We only use a few pointer variables.