First Missing Positive [Hard] — The Definitive Index-Marking Guide [Amazon, Google, Microsoft]

Sanjeev SharmaSanjeev Sharma
17 min read

Advertisement

Problem Statement

Given an unsorted integer array nums, return the smallest positive integer that is not present in nums. You must implement an algorithm that runs in O(n) time and uses O(1) auxiliary space.

Examples:

Input:  [1, 2, 0]
Output: 3
Reason: 1 and 2 are present. Next missing positive is 3.

Input:  [3, 4, -1, 1]
Output: 2
Reason: 1 and 3 are present. 2 is missing.

Input:  [7, 8, 9, 11, 12]
Output: 1
Reason: 1 is not present at all — it is immediately missing.

Constraints:

  • 1 <= nums.length <= 10^5
  • -2^31 <= nums[i] <= 2^31 - 1
  • O(n) time, O(1) auxiliary space — no extra arrays, no hash structures

Why This Problem Matters

LeetCode 41 is rated Hard not because the idea is exotic, but because the constraints surgically eliminate every approach that comes naturally to a developer.

Every programmer's first instinct on "find what's missing" is one of two things: either dump everything into a hash set and query it, or sort the array and scan for the gap. Both are correct algorithms. Both are banned by the problem constraints.

  • Hash set: O(n) time but O(n) space. Eliminated.
  • Sorting: O(n) time? No — O(n log n). Eliminated.

This means the elimination of naive approaches is itself the interview question. The interviewer is not testing whether you know hash sets exist. They are testing whether you can identify that the problem's structure provides implicit storage — and exploit it.

The deeper skill being tested transfers directly to many real-world problems: when you cannot afford extra memory, you must encode information inside the data you already have. This is why Google, Amazon, and Microsoft all love this problem. It divides candidates who recognize structural constraints from those who pattern-match to memorized templates.


The Key Insight — Why the Answer Is Always in [1, n+1]

This is the mathematical foundation that makes the whole algorithm possible. If you understand this, the rest follows naturally.

Claim: For an array of length n, the smallest missing positive integer is always in the range [1, n+1].

Proof: Consider the best possible scenario — the array contains exactly the integers 1, 2, 3, ..., n. In that case every positive integer from 1 to n is accounted for, and the smallest missing positive is n+1. This is the only scenario where the answer reaches n+1. In every other scenario — where some integers in [1..n] are missing, replaced by negatives, zeros, duplicates, or numbers larger than n — the answer is strictly less than n+1, somewhere in [1..n].

Consequence: Any value in the array that is less than 1 or greater than n is completely irrelevant to the answer. We can treat such values as junk and overwrite them with a sentinel, because they could never be the missing positive we are looking for.

This is the unlocking insight. The array has exactly n slots. The answer lives in at most n+1 possible positions. We have exactly as much space as we need — the array itself.


Why Naive Approaches Fail

Before looking at the solution, it is worth being precise about why each naive approach is ruled out, because articulating this clearly in an interview signals deep understanding.

Approach 1 — Hash Set

# O(n) time, O(n) space — fails the space constraint
seen = set(nums)
i = 1
while i in seen:
    i += 1
return i

This works perfectly and is the "right" solution when there is no space constraint. But it allocates O(n) auxiliary space for the set. The problem says O(1). Eliminated.

Approach 2 — Sort then scan

# O(n log n) time, O(1) space — fails the time constraint
nums.sort()
expected = 1
for x in nums:
    if x == expected:
        expected += 1
return expected

This uses no extra memory and produces the correct answer. But comparison-based sorting is O(n log n) by the lower bound theorem. The problem requires O(n). Eliminated.

Approach 3 — Bit manipulation or bucket array

Creating a boolean bucket array seen[0..n] is just an explicit hash set — still O(n) space. Eliminated.

The constraint pair (O(n) time, O(1) space) exists precisely to rule all of these out and force the index-marking insight.


The Index-Marking Technique

The key realization: we can use the array's own indices as a presence map.

Index i represents the question "does the number i+1 exist in the array?" If we can mark index i as "visited" when we see the number i+1, we can answer all n+1 questions in one scan — using zero extra space. The trick is encoding "visited" without losing the original data: we use the sign of the stored value as the flag.

The 3-Pass Algorithm

Pass 1 — Clean up irrelevant values

Replace every value that is not in [1..n] with the sentinel n+1. This includes negatives, zeros, and numbers larger than n. After this pass, every value in the array is either a relevant positive in [1..n], or the sentinel n+1.

Why n+1 as the sentinel? Because n+1 is out of range — it will never be mistaken for a real answer, and it carries no meaning in the marking pass.

for i in range(n):
    if nums[i] <= 0 or nums[i] > n:
        nums[i] = n + 1

Pass 2 — Mark indices by negating

Iterate through the cleaned array. For each value v = abs(nums[i]) (we take absolute value because earlier markings may have already flipped the sign):

  • If v is in [1..n], it means the number v exists in the original array.
  • Mark that fact by negating nums[v - 1].

The sign of nums[v-1] becomes negative to say "the number v has been seen."

for i in range(n):
    v = abs(nums[i])       # must use abs() — earlier iterations may have negated this
    if 1 <= v <= n:
        nums[v - 1] = -abs(nums[v - 1])   # negate, but only if not already negative

Note the abs() on both sides: abs(nums[i]) recovers the original value if already marked, and -abs(nums[v-1]) avoids double-negating (which would incorrectly restore a positive value).

Pass 3 — Scan for first unmarked index

The first index i where nums[i] > 0 means the number i+1 was never seen. That is the answer. If every index is negative (every number 1..n was present), return n+1.

for i in range(n):
    if nums[i] > 0:
        return i + 1
return n + 1

Why the Cyclic Sort Alternative Works Too

There is a second O(n) time, O(1) space approach that thinks differently. Instead of marking indices with signs, it physically places each number at its correct position.

Core idea: if the answer lies in [1..n+1], then every number x in [1..n] belongs at index x-1. Try to arrange the array so that position 0 holds 1, position 1 holds 2, and so on. After this rearrangement, scan for the first position that does not match.

for i in range(n):
    while 1 <= nums[i] <= n and nums[nums[i] - 1] != nums[i]:
        j = nums[i] - 1
        nums[i], nums[j] = nums[j], nums[i]   # swap x to its correct slot

The while loop terminates because every swap either places a number correctly (reducing the number of misplaced elements) or encounters a duplicate (which stops the swap because nums[nums[i]-1] == nums[i]). The total number of swaps across the entire outer loop is at most n, so the overall time remains O(n).

After sorting: scan for the first i where nums[i] != i+1. That index gives the answer i+1. If all match, return n+1.

Both approaches achieve the same complexity. The index-marking approach is slightly easier to reason about correctness; cyclic sort is more elegant once you internalize the placement idea.


Visual Dry Run — [3, 4, -1, 1] Through All 3 Passes

Let us trace the index-marking approach step by step. n = 4.

Starting state:

Index0123
Value34-11

Pass 1 — Replace non-positives and values > n with n+1 = 5:

  • nums[0] = 3 — in range [1..4], keep.
  • nums[1] = 4 — in range [1..4], keep.
  • nums[2] = -1 — not positive, replace with 5.
  • nums[3] = 1 — in range [1..4], keep.
Index0123
Value3451

Pass 2 — Mark by sign negation:

Iteration i=0: v = abs(3) = 3. Mark index 2: nums[2] = -abs(5) = -5.

Index0123
Value34-51

Iteration i=1: v = abs(4) = 4. Mark index 3: nums[3] = -abs(1) = -1.

Index0123
Value34-5-1

Iteration i=2: v = abs(-5) = 5. 5 > 4 = n, skip.

Iteration i=3: v = abs(-1) = 1. Mark index 0: nums[0] = -abs(3) = -3.

Index0123
Value-34-5-1

End of Pass 2. The array now encodes: index 0 is negative (1 was seen), index 1 is positive (2 was NOT seen), index 2 is negative (3 was seen), index 3 is negative (4 was seen).


Pass 3 — Scan for first positive:

  • i=0: nums[0] = -3 < 0, skip.
  • i=1: nums[1] = 4 > 0answer is i+1 = 2.

Result: 2 — correct, because 2 was missing from [3, 4, -1, 1].


Common Mistakes — 4 Bugs That Kill This Solution

Bug 1 — Forgetting abs() in Pass 2

# WRONG — previous markings corrupt the index computation
v = nums[i]         # if nums[i] is already negative due to earlier marking,
                    # v is negative, and nums[v-1] is an invalid negative index

Always use v = abs(nums[i]). Previous passes may have turned positive values negative, so you must recover the original value before using it as an index.

Bug 2 — Double-negating in the marking step

# WRONG — if nums[v-1] is already negative (already marked), this restores it to positive
nums[v - 1] = -nums[v - 1]

The fix is -abs(nums[v-1]), which is idempotent: marking an already-negative value stays negative.

Bug 3 — Infinite loop in cyclic sort (wrong termination condition)

# WRONG — swaps two equal elements forever when duplicates exist
while 1 <= nums[i] <= n:
    swap(nums[i], nums[nums[i] - 1])

The while condition must also check nums[nums[i] - 1] != nums[i]. Without this, if nums[i] = 3 and both nums[i] and nums[2] equal 3 (a duplicate), the algorithm swaps position i with position 2 and immediately needs to swap back — infinite loop.

Bug 4 — Off-by-one in Pass 3

The final return when all positions are filled must be n+1, not n. If the array is [1, 2, 3, 4], n=4, every index is marked, and the answer is 5. Returning n would give 4, which is wrong.


Solutions

Python — Index Marking (Primary Approach)

from typing import List

class Solution:
    def firstMissingPositive(self, nums: List[int]) -> int:
        n = len(nums)

        # Pass 1: Replace all values outside [1..n] with sentinel n+1
        for i in range(n):
            if nums[i] <= 0 or nums[i] > n:
                nums[i] = n + 1

        # Pass 2: For each value v in [1..n], negate nums[v-1] to mark v as seen.
        # Use abs() because earlier iterations may have already negated nums[i].
        for i in range(n):
            v = abs(nums[i])
            if 1 <= v <= n:
                # -abs(...) is idempotent: already-negative stays negative
                nums[v - 1] = -abs(nums[v - 1])

        # Pass 3: First index with a positive value means that number was never seen.
        for i in range(n):
            if nums[i] > 0:
                return i + 1  # number i+1 is missing

        # All positions 1..n were present — answer is n+1
        return n + 1

Python — Cyclic Sort (Bonus Approach)

from typing import List

class Solution:
    def firstMissingPositive(self, nums: List[int]) -> int:
        n = len(nums)

        # Place each number x in [1..n] at index x-1.
        # The duplicate guard `nums[nums[i]-1] != nums[i]` prevents infinite loops.
        for i in range(n):
            while 1 <= nums[i] <= n and nums[nums[i] - 1] != nums[i]:
                j = nums[i] - 1
                nums[i], nums[j] = nums[j], nums[i]  # swap x to its home index

        # First index where the value doesn't match its expected number reveals the gap.
        for i in range(n):
            if nums[i] != i + 1:
                return i + 1

        return n + 1  # array was exactly [1, 2, ..., n]

JavaScript — Index Marking (Primary Approach)

/**
 * @param {number[]} nums
 * @return {number}
 */
var firstMissingPositive = function(nums) {
    const n = nums.length;

    // Pass 1: Sanitize — replace anything outside [1..n] with sentinel n+1
    for (let i = 0; i < n; i++) {
        if (nums[i] <= 0 || nums[i] > n) {
            nums[i] = n + 1;
        }
    }

    // Pass 2: Mark presence — negate the value at index (v-1) when v is seen
    for (let i = 0; i < n; i++) {
        const v = Math.abs(nums[i]);  // recover original value (may be already negated)
        if (v >= 1 && v <= n) {
            // -Math.abs(...) is idempotent: won't un-mark an already-marked slot
            nums[v - 1] = -Math.abs(nums[v - 1]);
        }
    }

    // Pass 3: First positive slot means that number was never marked as present
    for (let i = 0; i < n; i++) {
        if (nums[i] > 0) {
            return i + 1;  // number i+1 is missing
        }
    }

    return n + 1;  // every number 1..n was present
};

JavaScript — Cyclic Sort (Bonus Approach)

/**
 * @param {number[]} nums
 * @return {number}
 */
var firstMissingPositive = function(nums) {
    const n = nums.length;

    // Place each valid number at its correct index (value v goes to index v-1)
    for (let i = 0; i < n; i++) {
        // Keep swapping until nums[i] is out of range or already in its correct slot.
        // The duplicate guard prevents an infinite loop when two equal values point
        // at the same target index.
        while (nums[i] >= 1 && nums[i] <= n && nums[nums[i] - 1] !== nums[i]) {
            const j = nums[i] - 1;
            [nums[j], nums[i]] = [nums[i], nums[j]];
        }
    }

    // Scan for the first slot where the value doesn't match the expected number
    for (let i = 0; i < n; i++) {
        if (nums[i] !== i + 1) {
            return i + 1;
        }
    }

    return n + 1;
};

Complexity Analysis

Index-Marking Approach

ComplexityExplanation
TimeO(n)Three independent linear passes. No nested loops.
SpaceO(1)All marking is done in-place using sign bits. No auxiliary data structures.

Why the marking pass is truly O(n): Each element is visited once in Pass 2. Negating a value is O(1). The abs() calls add no asymptotic cost. The three passes add to 3n iterations — still O(n).

Cyclic Sort Approach

ComplexityExplanation
TimeO(n)The inner while loop may run multiple times, but each swap places one element permanently at its correct index. Across all n iterations of the outer loop, the total number of swaps is at most n. Two passes total.
SpaceO(1)All swaps are in-place. Only a constant number of index variables used.

The key insight for cyclic sort's time complexity: each element is swapped to its final position at most once, because the swap condition requires nums[nums[i]-1] != nums[i]. After a value lands at index v-1, it will never be moved again (the guard condition stops it). Therefore the total swap count across all iterations is bounded by n.


Follow-up Questions

Missing Number (LeetCode 268) — Easy

"Given n distinct numbers in range [0, n], find the one that is missing."

Simpler version — either use Gauss's formula n*(n+1)/2 - sum(nums), or XOR all indices with all values. No index marking needed because values are guaranteed distinct and range is fixed.

What this shares with LC 41: The same bounding argument — exactly one number from a known range is missing. LC 41 generalizes this to arbitrary arrays.

Find All Numbers Disappeared in an Array (LeetCode 448) — Easy

"Find all numbers in [1, n] that do not appear in an array of length n."

Direct application of index marking. Mark by negating for each value seen. Then collect all indices where the value is still positive — those represent missing numbers. You may return multiple answers. Essentially LC 41's Pass 2 and Pass 3, but collecting all gaps instead of the first.

Find the Duplicate Number (LeetCode 287) — Medium

"Given n+1 integers in [1, n], find the duplicate. O(1) space."

The classic solution uses Floyd's cycle detection (fast/slow pointers), treating the array as a linked list where nums[i] is the "next" pointer. The index-marking insight applies here too: if you try to mark and a slot is already marked negative, the current value is the duplicate.


This Pattern Solves

The "array as presence map" technique — whether via sign negation or cyclic sort — applies to a broader family of problems:

ProblemPattern
LC 41 — First Missing PositiveSign negation marking, find first unmarked
LC 268 — Missing NumberGauss formula or XOR variant of same range idea
LC 448 — Find All Disappeared NumbersSign negation marking, collect all unmarked
LC 287 — Find the DuplicateCyclic sort: two numbers compete for same slot
LC 442 — Find All Duplicates in an ArraySign negation: slot already negative = duplicate
LC 645 — Set MismatchCombined: find marked slot (duplicate) and unmarked slot (missing)

The unifying idea: when the values and indices of an array are drawn from the same range, indices can double as a presence set. Negation, cyclic sort, and XOR are three different ways to encode "seen" without extra space.


Key Takeaway

First Missing Positive is hard precisely because it bans the easy answers and forces you to see the array as both a container and a data structure simultaneously. The mathematical insight — that the answer must lie in [1, n+1] — is what makes in-place marking possible: it bounds the domain to exactly the size of the array. Once you see that, the three-pass sign-negation algorithm falls out naturally: clean up irrelevant values, use sign bits as presence flags, and read the first unmarked slot.

Master this problem and you have internalized one of the most transferable techniques in competitive programming: treating the array itself as a hash map. LeetCode 448, 442, 287, and 645 are all direct applications of the same idea.

Advertisement

Sanjeev Sharma

Written by

Sanjeev Sharma

Full Stack Engineer · E-mopro