Median of Two Sorted Arrays [Hard] — Binary Search on Partition (The Definitive Guide)

Sanjeev SharmaSanjeev Sharma
19 min read

Advertisement

Problem Statement

Given two sorted arrays nums1 and nums2 of size m and n respectively, return the median of the two sorted arrays.

The overall run time complexity must be O(log(m + n)).

Example 1 — Odd total length:

Input:  nums1 = [1, 3],  nums2 = [2]
Output: 2.0
Reason: Merged array = [1, 2, 3]. Median is the middle element: 2.

Example 2 — Even total length:

Input:  nums1 = [1, 2],  nums2 = [3, 4]
Output: 2.5
Reason: Merged array = [1, 2, 3, 4]. Median = (2 + 3) / 2 = 2.5

Constraints:

  • 0 <= m, n <= 1000
  • 1 <= m + n <= 2000
  • -10^6 <= nums1[i], nums2[i] <= 10^6
  • The time complexity requirement is O(log(m + n)), which effectively means O(log(min(m, n))).

Why This Problem Matters

LeetCode 4 is the only Hard problem that appears on virtually every "must-know" FAANG list — not because the code is long, but because the insight it requires is genuinely non-obvious. It is a top-asked problem at Google, Amazon, Facebook, and Microsoft, frequently used as the "separating question" that filters candidates who truly understand binary search from those who just pattern-match.

Here is what makes it special: almost every binary search problem you have ever seen searches on array values — you are looking for a specific number inside the array. This problem is different. You are performing binary search on an abstract index space — the partition point — and the correctness condition is a relational check between four values, not a simple arr[mid] == target comparison.

That abstraction jump is exactly what interviewers are testing. Can you recognise that binary search is applicable even when there is no "target value" to look for? Can you translate a complex constraint (median of a merged array) into a binary search condition? If you can, you have demonstrated a level of algorithmic thinking that separates senior engineers from the rest.

Why O(log(m+n)) and not O(m+n)? The naive approach — merge both arrays and return the middle element — is O(m+n) in both time and space. The problem explicitly bans this. The only way to achieve logarithmic time on sorted data is binary search. But binary search on what? That is the puzzle.


The Naive Approach — Merge and Find (O(m+n))

Before building up to the optimal solution, it helps to see what you are trying to improve on.

The straightforward approach is the classic merge step from merge sort:

  1. Allocate a new array of size m + n.
  2. Merge nums1 and nums2 into it using two pointers.
  3. Return merged[(m+n)//2] for odd total, or (merged[(m+n)//2 - 1] + merged[(m+n)//2]) / 2 for even total.
# Naive — O(m+n) time, O(m+n) space — violates the constraint
def findMedianSortedArrays_naive(nums1, nums2):
    merged = []
    i, j = 0, 0
    while i < len(nums1) and j < len(nums2):
        if nums1[i] <= nums2[j]:
            merged.append(nums1[i]); i += 1
        else:
            merged.append(nums2[j]); j += 1
    merged.extend(nums1[i:]); merged.extend(nums2[j:])
    total = len(merged)
    if total % 2 == 1:
        return float(merged[total // 2])
    return (merged[total // 2 - 1] + merged[total // 2]) / 2.0

This is completely correct and easy to reason about. The problem? It is O(m+n). With m and n each up to 1000, you process up to 2000 elements. The interviewer wants O(log(min(m,n))). You need to cut the work down from 2000 operations to roughly 10.


The Partition Insight — What Is a Median, Really?

To get from O(m+n) to O(log(min(m,n))), you need to reframe what "finding the median" means at a conceptual level.

The median of a sorted array of length L is the value that splits the array into two equal halves:

  • If L is odd: the median is the single middle element. Left half has (L-1)/2 elements, right half has (L-1)/2 elements.
  • If L is even: the median is the average of the two middle elements. Both halves have L/2 elements.

Now, instead of merging and splitting, ask: can we find a split in each array simultaneously such that the left portion of array1 + left portion of array2 forms exactly the correct left half of the merged sorted sequence?

Concretely, define:

  • i = number of elements taken from nums1 into the left half
  • j = number of elements taken from nums2 into the left half

For the partition to be valid, two conditions must both hold:

Condition 1: i + j == (m + n + 1) // 2
             (the left halves together contain exactly half the elements)

Condition 2: maxLeft1 <= minRight2   AND   maxLeft2 <= minRight1
             (every element on the left is ≤ every element on the right)

Where:

  • maxLeft1 = nums1[i-1] (the largest element taken from nums1's left)
  • minRight1 = nums1[i] (the smallest element left in nums1's right)
  • maxLeft2 = nums2[j-1] (the largest element taken from nums2's left)
  • minRight2 = nums2[j] (the smallest element left in nums2's right)

Condition 1 determines j once we pick i. If i + j = half, then j = half - i. So we only have one free variable: i. Binary search on i.

Condition 2 is the correctness check. Because both arrays are sorted internally, we already know nums1[i-1] <= nums1[i] and nums2[j-1] <= nums2[j]. The only cross-checks we need are maxLeft1 <= minRight2 and maxLeft2 <= minRight1.

This is the key insight: we do not need to see the entire merged array. We only need to check four boundary values.


The Binary Search on Partition

Since we must have 0 <= i <= m (we can take zero elements or all elements from nums1), and j is fully determined by i, we binary search i in the range [0, m].

Always binary search the smaller array. If m > n, swap the arrays. This keeps the search space minimal (O(log(min(m,n)))) and guarantees j is never negative.

At each midpoint i, compute j = half - i and then evaluate Condition 2:

SituationWhat it meansAction
maxLeft1 <= minRight2 AND maxLeft2 <= minRight1Valid partition foundCompute and return median
maxLeft1 > minRight2i took too many elements from nums1; left side of nums1 is too largeMove i left: hi = i - 1
maxLeft2 > minRight1i took too few elements from nums1; left side of nums2 is too largeMove i right: lo = i + 1

The binary search terminates because:

  • We always move lo or hi strictly toward each other.
  • A valid partition is guaranteed to exist (the problem guarantees at least one element total).

Edge Cases — The Sentinel Values

When i = 0, there are no elements taken from nums1's left. nums1[i-1] would be an out-of-bounds access. Similarly, when i = m, there are no elements in nums1's right.

The clean fix: use sentinel values (conceptual negative and positive infinity):

maxLeft1  = nums1[i-1]  if i > 0  else -infinity
minRight1 = nums1[i]    if i < m  else +infinity

maxLeft2  = nums2[j-1]  if j > 0  else -infinity
minRight2 = nums2[j]    if j < n  else +infinity

Sentinel logic:

  • -infinity for a missing left element: it will never violate maxLeft <= minRight since no real value is smaller.
  • +infinity for a missing right element: it will never violate maxLeft <= minRight since no real value is larger.

This eliminates four separate if checks inside the validity test, keeping the core logic clean.


The Median Formula

Once you have found the valid partition:

If total length is odd ((m + n) % 2 == 1):

median = max(maxLeft1, maxLeft2)

The left half has one more element than the right. The median is the largest element of the combined left portion.

If total length is even ((m + n) % 2 == 0):

median = (max(maxLeft1, maxLeft2) + min(minRight1, minRight2)) / 2.0

Both halves are equal in size. The median is the average of the largest-left and smallest-right.

The formula half = (m + n + 1) // 2 — with the +1 — ensures that for odd totals, the left half always has the extra element. This is intentional: max(maxLeft1, maxLeft2) then directly gives the median for the odd case without any additional logic.


Visual Dry Run

Example 1: nums1 = [1, 3], nums2 = [2]

  • m = 2, n = 1. Since m > n, swap: A = [2] (length 1), B = [1, 3] (length 2).
  • half = (1 + 2 + 1) // 2 = 2
  • Binary search: lo = 0, hi = 1

Iteration 1: i = (0+1)//2 = 0, j = 2 - 0 = 2

maxLeft1  = A[-1]-inf   (i=0, no left in A)
minRight1 = A[0]   = 2
maxLeft2  = B[1]   = 3
minRight2 = B[2]+inf   (j=2=n, no right in B)

Check: maxLeft1(-inf) <= minRight2(+inf) ✓, maxLeft2(3) <= minRight1(2)? 3 > 2, FAIL. maxLeft2 > minRight1 → need more from A → lo = i + 1 = 1

Iteration 2: i = (1+1)//2 = 1, j = 2 - 1 = 1

maxLeft1  = A[0]   = 2
minRight1 = A[1]+inf   (i=1=m, no right in A)
maxLeft2  = B[0]   = 1
minRight2 = B[1]   = 3

Check: maxLeft1(2) <= minRight2(3) ✓, maxLeft2(1) <= minRight1(+inf) ✓ — Valid partition!

Total is odd (3): median = max(2, 1) = 2.0


Example 2: nums1 = [1, 2], nums2 = [3, 4]

  • m = 2, n = 2. Arrays already same length. Keep A = [1,2], B = [3,4].
  • half = (2 + 2 + 1) // 2 = 2
  • Binary search: lo = 0, hi = 2

Iteration 1: i = (0+2)//2 = 1, j = 2 - 1 = 1

maxLeft1  = A[0]   = 1
minRight1 = A[1]   = 2
maxLeft2  = B[0]   = 3
minRight2 = B[1]   = 4

Check: maxLeft1(1) <= minRight2(4) ✓, maxLeft2(3) <= minRight1(2)? 3 > 2, FAIL. maxLeft2 > minRight1 → need more from A → lo = i + 1 = 2

Iteration 2: i = (2+2)//2 = 2, j = 2 - 2 = 0

maxLeft1  = A[1]   = 2
minRight1 = A[2]+inf   (i=2=m)
maxLeft2  = B[-1]-inf   (j=0)
minRight2 = B[0]   = 3

Check: maxLeft1(2) <= minRight2(3) ✓, maxLeft2(-inf) <= minRight1(+inf) ✓ — Valid partition!

Total is even (4): median = (max(2, -inf) + min(+inf, 3)) / 2 = (2 + 3) / 2 = 2.5


Common Mistakes

These five bugs appear in the majority of failed submissions. Each causes either a wrong answer or a runtime error — neither is obvious at first glance.

Mistake 1 — Binary searching the larger array. Always swap so you binary search the smaller array. If you binary search nums1 when m > n, then j = half - i can go negative (since half < m is possible), causing out-of-bounds access. The swap costs O(1) and prevents the entire class of bugs.

Mistake 2 — Using wrong sentinel values. In Python, float('-inf') and float('inf') are the correct sentinels. In JavaScript, use -Infinity and Infinity — not Number.MIN_SAFE_INTEGER / Number.MAX_SAFE_INTEGER. Using safe integer boundaries fails on edge cases where actual array values are close to those bounds (e.g., [-10^6] and [10^6]).

Mistake 3 — Wrong median formula for the odd case. A frequent mistake is returning min(minRight1, minRight2) for the odd case (taking from the right half) instead of max(maxLeft1, maxLeft2) (taking from the left). The formula half = (m + n + 1) // 2 biases the left half to be larger, so the median for odd totals always lives at the top of the left half. If you use half = (m + n) // 2, you get the opposite bias and the formula breaks.

Mistake 4 — Off-by-one in binary search bounds. The search range must be lo = 0, hi = m (inclusive on both ends), not lo = 0, hi = m - 1. Why? i = m is a valid partition — it means all elements of nums1 go to the left half, and none go to the right. Capping hi at m - 1 silently misses this case, producing wrong answers specifically when nums1's largest element is less than everything in nums2.

Mistake 5 — Confusing which condition to check for binary search adjustment. When maxLeft1 > minRight2: you took too many elements from nums1 — move i left (hi = i - 1). When maxLeft2 > minRight1: you took too few elements from nums1 — move i right (lo = i + 1). Swapping these two adjustments produces a solution that binary searches in the wrong direction and loops forever or produces wrong results.


Solutions

Python — Naive Merge Approach (O(m+n))

def findMedianSortedArrays_naive(nums1: list[int], nums2: list[int]) -> float:
    """
    Merge both sorted arrays, then pick the middle element(s).
    Time: O(m+n)  Space: O(m+n)
    Correct, but violates the O(log(m+n)) constraint — shown for comparison only.
    """
    merged = []
    i, j = 0, 0

    # Standard merge-sort merge step
    while i < len(nums1) and j < len(nums2):
        if nums1[i] <= nums2[j]:
            merged.append(nums1[i]); i += 1
        else:
            merged.append(nums2[j]); j += 1

    # Append any remaining elements
    merged.extend(nums1[i:])
    merged.extend(nums2[j:])

    total = len(merged)
    mid = total // 2

    if total % 2 == 1:
        return float(merged[mid])                           # odd: single middle element
    return (merged[mid - 1] + merged[mid]) / 2.0           # even: average of two middles

Python — Optimal Binary Search on Partition (O(log(min(m,n))))

def findMedianSortedArrays(nums1: list[int], nums2: list[int]) -> float:
    """
    Binary search on the partition point of the smaller array.
    Time: O(log(min(m, n)))  Space: O(1)
    """
    A, B = nums1, nums2

    # Always binary search the smaller array so j = half - i is never negative
    if len(A) > len(B):
        A, B = B, A

    m, n = len(A), len(B)
    # half = size of the combined left half (left biased for odd totals)
    half = (m + n + 1) // 2

    lo, hi = 0, m  # i can be anywhere from 0 (take nothing from A) to m (take all of A)

    while lo <= hi:
        i = (lo + hi) // 2      # elements taken from A into the left half
        j = half - i            # elements taken from B into the left half (auto-determined)

        # Boundary sentinels: -inf means "no left element exists", +inf means "no right element exists"
        maxLeft1  = A[i - 1] if i > 0 else float('-inf')
        minRight1 = A[i]     if i < m else float('inf')

        maxLeft2  = B[j - 1] if j > 0 else float('-inf')
        minRight2 = B[j]     if j < n else float('inf')

        if maxLeft1 <= minRight2 and maxLeft2 <= minRight1:
            # Valid partition found — compute median
            if (m + n) % 2 == 1:
                # Odd total: median is the largest element of the combined left half
                return float(max(maxLeft1, maxLeft2))
            else:
                # Even total: average of the boundary elements across the partition
                return (max(maxLeft1, maxLeft2) + min(minRight1, minRight2)) / 2.0

        elif maxLeft1 > minRight2:
            # A's left is too large — move partition left in A
            hi = i - 1
        else:
            # B's left is too large — move partition right in A (takes more from A)
            lo = i + 1

    # Should never reach here given valid inputs
    return 0.0

JavaScript — Naive Merge Approach (O(m+n))

/**
 * Naive: merge both arrays and pick the middle element(s).
 * Time: O(m+n)  Space: O(m+n)
 * Shown for comparison — violates the problem's time constraint.
 */
function findMedianSortedArrays_naive(nums1, nums2) {
    const merged = [];
    let i = 0, j = 0;

    while (i < nums1.length && j < nums2.length) {
        if (nums1[i] <= nums2[j]) merged.push(nums1[i++]);
        else merged.push(nums2[j++]);
    }

    while (i < nums1.length) merged.push(nums1[i++]);
    while (j < nums2.length) merged.push(nums2[j++]);

    const total = merged.length;
    const mid = Math.floor(total / 2);

    if (total % 2 === 1) return merged[mid];
    return (merged[mid - 1] + merged[mid]) / 2;
}

JavaScript — Optimal Binary Search on Partition (O(log(min(m,n))))

/**
 * Binary search on partition point of the smaller array.
 * Time: O(log(min(m, n)))  Space: O(1)
 *
 * @param {number[]} nums1
 * @param {number[]} nums2
 * @return {number}
 */
function findMedianSortedArrays(nums1, nums2) {
    let A = nums1, B = nums2;

    // Ensure we binary search the smaller array
    if (A.length > B.length) [A, B] = [B, A];

    const m = A.length, n = B.length;
    // Left half size — biased up by 1 for odd totals so median lives at left's max
    const half = Math.floor((m + n + 1) / 2);

    let lo = 0, hi = m;

    while (lo <= hi) {
        const i = Math.floor((lo + hi) / 2); // elements from A in the left half
        const j = half - i;                  // elements from B in the left half

        // Sentinel values handle partitions at array boundaries cleanly
        const maxLeft1  = i > 0 ? A[i - 1] : -Infinity;
        const minRight1 = i < m ? A[i]     :  Infinity;
        const maxLeft2  = j > 0 ? B[j - 1] : -Infinity;
        const minRight2 = j < n ? B[j]     :  Infinity;

        if (maxLeft1 <= minRight2 && maxLeft2 <= minRight1) {
            // Found the correct partition — compute the median
            if ((m + n) % 2 === 1) {
                // Odd total: largest element of the combined left half
                return Math.max(maxLeft1, maxLeft2);
            } else {
                // Even total: average of the two boundary elements
                return (Math.max(maxLeft1, maxLeft2) + Math.min(minRight1, minRight2)) / 2;
            }
        } else if (maxLeft1 > minRight2) {
            // A's left portion is too large — move partition left in A
            hi = i - 1;
        } else {
            // B's left portion is too large — move partition right in A
            lo = i + 1;
        }
    }

    return 0; // Unreachable for valid inputs
}

Complexity Analysis

ApproachTimeSpaceNotes
Naive mergeO(m + n)O(m + n)Violates the constraint
Binary search on partitionO(log(min(m, n)))O(1)Always binary search smaller array

Why O(log(min(m,n))) and not O(log(m+n))?

The binary search range is [0, m] where m = min(len(nums1), len(nums2)). Each iteration halves this range. So the number of iterations is log2(m + 1), which is O(log(min(m,n))). The problem states O(log(m+n)) in the constraint, which is a looser bound — our solution is strictly better.

Space: The algorithm only stores a constant number of variables (lo, hi, i, j, four boundary values). No auxiliary arrays are allocated. O(1) space.


Follow-up Questions

These are the most common extensions asked in follow-up rounds at Google, Amazon, and Facebook after you solve the main problem:

1. Kth smallest element in two sorted arrays. This is a generalisation of the exact same problem. The median is just the (m+n)//2th (or (m+n+1)//2th) smallest element. You can solve the general Kth-smallest problem using a similar binary search approach: at each step, eliminate k//2 elements from consideration by comparing the k//2th element of each array. Time: O(log k).

2. Kth smallest element in a sorted matrix. Binary search on the value space (not index space): for a given candidate value mid, count how many elements in the matrix are <= mid using binary search on each row. Use this count to converge on the Kth smallest value. Time: O(n log(max - min)).

3. Weighted median. Each element has a weight. The weighted median is the value where the cumulative weight crosses 0.5. Binary search approach still applies but you sum weights during the partition check instead of counting elements.

4. Median of a data stream (LeetCode 295). If elements arrive one at a time, maintain a max-heap for the left half and a min-heap for the right half. Each insertion is O(log n). The median is always at the top of one or both heaps.

5. Find the kth smallest pair distance (LeetCode 719). Binary search on the answer (the distance value), then count pairs with distance <= mid. A direct application of the "binary search on abstract space" pattern introduced here.


This Pattern Solves

The "binary search on partition / binary search on answer" pattern generalises far beyond this problem. Once you recognise it here, you will start seeing it everywhere:

  • Split Array Largest Sum (LeetCode 410): Binary search on the maximum sum of a subarray.
  • Capacity to Ship Packages (LeetCode 1011): Binary search on the ship capacity value.
  • Koko Eating Bananas (LeetCode 875): Binary search on the eating speed.
  • Find Peak Element (LeetCode 162): Binary search on index where the "peak" condition flips.
  • H-Index II (LeetCode 275): Binary search on the h value.

In all of these, the key move is the same: reformulate the problem as "find the smallest/largest value of X such that a condition C(X) holds", then binary search on X.

For Median of Two Sorted Arrays, X is the partition index i, and C(i) is maxLeft1 <= minRight2 AND maxLeft2 <= minRight1.


Key Takeaways

The core insight of this problem is that the median does not require you to see the merged array — it only requires you to find a partition boundary where all left elements are at most all right elements. Because one partition index fully determines the other (their sum must equal half the total), you have a single free variable to binary search over.

The sentinel values (-inf for missing left elements, +inf for missing right elements) are not a hack — they are the mathematically correct way to express "no constraint exists from this side," and they eliminate four separate boundary checks that would otherwise obscure the algorithm's logic.

If you take one thing from this problem: binary search is not just for finding a value in an array. It is for any problem where you can reformulate the answer as a threshold on a monotone function and check each candidate in sub-linear time.

Advertisement

Sanjeev Sharma

Written by

Sanjeev Sharma

Full Stack Engineer · E-mopro