Merge Intervals — The Definitive Guide (LeetCode 56) [Google, Meta, Amazon, Microsoft]
Advertisement
Problem Statement
Given an array of intervals where
intervals[i] = [start_i, end_i], merge all overlapping intervals and return an array of the non-overlapping intervals that cover all the intervals in the input.
Example 1 — multiple merges:
Input: [[1,3],[2,6],[8,10],[15,18]]
Output: [[1,6],[8,10],[15,18]]
[1,3] and [2,6] overlap → merged into [1,6]
[8,10] does not overlap with [1,6] → kept as-is
[15,18] does not overlap with [8,10] → kept as-is
Example 2 — everything merges into one:
Input: [[1,4],[4,5]]
Output: [[1,5]]
[1,4] and [4,5] share endpoint 4, so they overlap → merged into [1,5]
- Why This Problem Matters
- The Key Insight — Why Sorting by Start Time Is Necessary
- The Overlap Condition
- The Off-by-One Trap — < vs <=
- The Algorithm Walk-Through
- Visual Dry Run
- Common Mistakes
- Solutions
- Python
- JavaScript
- Complexity Analysis
- Follow-up Questions
- LC 57 — Insert Interval (Medium)
- LC 435 — Non-overlapping Intervals (Medium)
- LC 252 — Meeting Rooms (Easy)
- LC 253 — Meeting Rooms II (Medium) — The Fan-Favorite Follow-up
- This Pattern Solves
- Key Takeaway
Why This Problem Matters
Merge Intervals shows up in roughly 15–20% of FAANG-level coding interviews. Google uses it to test greedy thinking. Meta uses it in phone screens for data pipeline roles. Amazon surfaces it in system design discussions about calendar services and resource allocation. Microsoft asks it when screening for cloud-scheduling features.
But the real reason interviewers love it is not the algorithm — the algorithm is short. They love it because it exposes whether you think carefully about boundary conditions, whether you instinctively know to sort before you scan, and whether you can extend a simple pattern to harder follow-up questions (Meeting Rooms II, etc.) without starting from scratch.
Beyond the interview room, the pattern has concrete real-world applications:
- Calendar conflict detection — Google Calendar merges your free-busy blocks before showing availability to external users scheduling a meeting.
- Cloud resource scheduling — AWS Batch and Azure Scheduler merge job time windows to calculate machine utilization.
- Network packet reassembly — TCP reassembles out-of-order byte ranges by merging overlapping segments.
- Genomics — DNA sequencing pipelines merge overlapping read alignments to reconstruct chromosomes.
Understanding this pattern deeply gives you a mental template that transfers across all of these domains.
The Key Insight — Why Sorting by Start Time Is Necessary
Before sorting, overlapping intervals can be anywhere in the array. You would need to compare every pair — O(n²) work — to find all overlaps.
After sorting by start time, something powerful happens: if interval A starts before interval B, the only way they can overlap is if A's end reaches into B's start. You never have to look backwards. Each interval only needs to be compared with the one immediately before it in the result list. A single left-to-right pass is enough.
Without sorting first, consider this input: [[8,10],[1,3],[2,6],[15,18]]
If you process left to right without sorting:
- Add
[8,10]to result. [1,3]doesn't overlap[8,10]→ add it. Result:[[8,10],[1,3]][2,6]overlaps[1,3]but you're comparing it to[1,3](the last item). This could work here, but only by luck of processing order.- Now you'd never go back to check if
[1,6]overlaps[8,10]— you'd need another pass.
Sorting eliminates all of this. After sorting by start time: [[1,3],[2,6],[8,10],[15,18]]. Now adjacency in the sorted array implies potential overlap, and a single pass produces the correct answer.
The sort is not optional. It is the algorithm.
The Overlap Condition
Two intervals [a, b] and [c, d] overlap if and only if:
c <= b (assuming a <= c, i.e., sorted by start time)
Visualized on a number line:
Case 1: Overlapping
[a ─────── b]
[c ────── d]
c <= b ✓ → merge to [a, max(b, d)]
Case 2: Touching at a single point
[a ──── b]
[c ──── d]
c == b ✓ → merge to [a, d]
Case 3: No overlap
[a ─── b]
[c ─── d]
c > b ✗ → keep both
After sorting by start time, when two intervals overlap, the merged result is always:
[min(a, c), max(b, d)] = [a, max(b, d)]
Because we sorted, a <= c is guaranteed, so min(a, c) = a. The only value we need to update is the end: take the maximum of the two ends.
The Off-by-One Trap — < vs <=
This is the single most common bug in this problem. If you write c < b instead of c <= b, you will fail to merge touching intervals like [1,4] and [4,5].
The problem statement treats two intervals as overlapping when they share even a single point. Endpoint 4 belongs to both [1,4] and [4,5], so they must merge into [1,5]. Using strict less-than (<) breaks this case.
Use <= for the overlap check. Always.
The Algorithm Walk-Through
Here is the complete algorithm in plain English:
- Sort the intervals by their start value.
- Seed the result array with the first interval.
- Walk through the remaining intervals one by one:
- Look at the last interval in the result (call it
last). - If the current interval's start is
<= last.end, they overlap. Extendlast.endtomax(last.end, current.end). Do not add a new entry. - If the current interval's start is
> last.end, they don't overlap. Append the current interval as a new entry.
- Look at the last interval in the result (call it
- Return the result array.
The key mental model: keep the last merged interval hot in memory, and keep stretching its right boundary as long as incoming intervals keep overlapping it. The moment an interval no longer reaches, close the current merged interval and start a new one.
Visual Dry Run
Input: [[1,3],[2,6],[8,10],[15,18]]
After sorting (already sorted): [[1,3],[2,6],[8,10],[15,18]]
Step 0 — seed result:
result = [[1,3]]
last = [1,3]
Step 1 — process [2,6]:
2 <= 3? YES → overlap!
Extend last.end to max(3, 6) = 6
result = [[1,6]]
last = [1,6]
Step 2 — process [8,10]:
8 <= 6? NO → no overlap
Append [8,10]
result = [[1,6],[8,10]]
last = [8,10]
Step 3 — process [15,18]:
15 <= 10? NO → no overlap
Append [15,18]
result = [[1,6],[8,10],[15,18]]
Final answer: [[1,6],[8,10],[15,18]]
Now let's trace a trickier case — a "swallowing" overlap where one interval is entirely inside another: [[1,10],[2,3],[4,5]]
Step 0 — seed result:
result = [[1,10]]
Step 1 — process [2,3]:
2 <= 10? YES → overlap!
Extend last.end to max(10, 3) = 10 (10 unchanged — [2,3] is swallowed)
result = [[1,10]]
Step 2 — process [4,5]:
4 <= 10? YES → overlap!
Extend last.end to max(10, 5) = 10 (still swallowed)
result = [[1,10]]
Final answer: [[1,10]]
This is why max(last.end, current.end) matters — you can't just overwrite with current.end because the current interval might be entirely contained inside the last merged one.
Common Mistakes
Mistake 1 — Skipping the sort. The most fundamental error. Without sorting, you can't guarantee that overlapping intervals are adjacent in your scan. Always sort by start time first. Time cost: O(n log n) — but this is the total complexity anyway.
Mistake 2 — Using strict < instead of <= for the overlap check. intervals[i][0] < last[1] will miss the case where two intervals share a single endpoint (e.g., [1,4] and [4,5]). Use intervals[i][0] <= last[1] to catch touching intervals.
Mistake 3 — Overwriting last.end without taking the max. Writing last.end = current.end instead of last.end = max(last.end, current.end) breaks on swallowed intervals. If the current interval ends before the last merged interval ends, you'd shrink the merged interval — the wrong answer. Always use max.
Mistake 4 — Forgetting to seed the result with the first interval. Starting with an empty result array and jumping straight into the loop means your first iteration has no "last" to compare against. Either seed the result with intervals[0] before the loop, or handle it inside the loop with a special case.
Solutions
Python
def merge(intervals: list[list[int]]) -> list[list[int]]:
# Step 1: Sort by start time so overlapping intervals become adjacent
intervals.sort(key=lambda x: x[0])
# Step 2: Seed result with the first interval
result = [intervals[0]]
for start, end in intervals[1:]:
last_start, last_end = result[-1]
# Step 3: Check overlap — current start is within the last merged interval
if start <= last_end:
# Extend the last merged interval's end (use max to handle swallowing)
result[-1][1] = max(last_end, end)
else:
# No overlap — start a new merged interval
result.append([start, end])
return result
JavaScript
/**
* @param {number[][]} intervals
* @return {number[][]}
*/
function merge(intervals) {
// Step 1: Sort by start time
intervals.sort((a, b) => a[0] - b[0]);
// Step 2: Seed result with the first interval
const result = [intervals[0]];
for (let i = 1; i < intervals.length; i++) {
const [start, end] = intervals[i];
const last = result[result.length - 1];
// Step 3: Check overlap
if (start <= last[1]) {
// Extend the last merged interval's end
last[1] = Math.max(last[1], end);
} else {
// No overlap — push a new interval
result.push([start, end]);
}
}
return result;
}
Both solutions are idiomatic, handle the "swallowing" edge case correctly, and use the <= overlap condition.
Complexity Analysis
| Time | Space | |
|---|---|---|
| Sorting | O(n log n) | O(1) or O(log n) for sort stack |
| Linear scan | O(n) | O(n) for output |
| Total | O(n log n) | O(n) |
Why O(n log n)? The sort dominates. The merge scan is a single left-to-right pass — each interval is looked at exactly once, making it O(n). But since O(n log n) + O(n) = O(n log n), the sort sets the ceiling.
Space: The output array holds at most n intervals (when nothing merges), so O(n) space. If sorting in-place is acceptable (modifying the input), auxiliary space drops to O(log n) for the recursive call stack of most sort implementations.
Can we do better than O(n log n)? Only if the intervals arrive pre-sorted. If they're already sorted, the merge scan alone is O(n). In practice, inputs are rarely pre-sorted, so O(n log n) is the de-facto optimal.
Follow-up Questions
These four problems build directly on the merge intervals pattern. Knowing this problem deeply makes all of them approachable.
LC 57 — Insert Interval (Medium)
You are given a list of non-overlapping, sorted intervals. Insert a new interval and merge as necessary.
Approach: Since the input is already sorted and non-overlapping, one linear pass is enough. Walk left to right in three phases: (1) collect all intervals that end before the new interval starts (intervals[i][1] < newInterval[0]), (2) merge all intervals that overlap with the new interval by expanding its boundaries, (3) collect all remaining intervals that start after the new interval ends. O(n) time.
LC 435 — Non-overlapping Intervals (Medium)
Find the minimum number of intervals to remove to make the rest non-overlapping.
Approach: Sort by end time (not start time — this is the twist). Walk left to right, keeping the end of the last non-overlapping interval. When you encounter an overlap, remove the interval with the later end (greedy: keep the one that ends sooner to leave room for future intervals). Count removals. This is essentially an interval scheduling maximization problem — find the maximum number of non-overlapping intervals, then subtract from n.
LC 252 — Meeting Rooms (Easy)
Given a list of meeting intervals, determine if a person can attend all meetings (no overlaps allowed).
Approach: Sort by start time. Check if any two consecutive meetings overlap (intervals[i][0] < intervals[i-1][1]). If yes, return false. O(n log n).
LC 253 — Meeting Rooms II (Medium) — The Fan-Favorite Follow-up
Find the minimum number of conference rooms required to hold all meetings.
Approach (Min-Heap / Priority Queue): Sort meetings by start time. Use a min-heap to track the end times of meetings currently in progress. For each new meeting, if it starts after the earliest-ending meeting (heap top ≤ current start), reuse that room (pop the heap). Otherwise, allocate a new room. Push the current meeting's end time. The heap's maximum size at any point equals the peak simultaneous meetings — and that is your answer.
Approach (Two-Pointer Sweep Line): Create separate sorted arrays for start times and end times. Use two pointers. If the next start is before the next end, a new room is needed (increment rooms, advance start pointer). Otherwise, a meeting ended and freed a room (decrement rooms, advance end pointer). Track the maximum rooms needed. O(n log n).
This problem is asked almost as frequently as Merge Intervals itself and is a natural escalation in technical rounds.
This Pattern Solves
Once you internalize the sort-then-scan template, you can apply it to many other problems:
- Minimum number of arrows to burst balloons (LC 452) — sort by end, greedily shoot where the current group ends
- Employee free time (LC 759) — flatten all employee schedules into one list, sort, find the gaps between merged intervals
- Interval List Intersections (LC 986) — two-pointer on two sorted interval lists, advance whichever pointer ends sooner
- Skyline Problem (LC 218) — sweep line with a max-heap tracking active building heights
- Maximum CPU Load — same as Meeting Rooms II, applied to jobs instead of meetings
- Task scheduler with cooldown periods — interval-based greedy with frequency counting
All of these reduce to: sort events or intervals, sweep left to right, maintain state about what is currently active.
Key Takeaway
The merge intervals algorithm is a two-step recipe: sort by start time to make overlapping intervals adjacent, then greedily extend the last merged interval whenever the next interval's start falls within its range. The critical details that separate correct from incorrect solutions are the <= overlap condition (which handles touching endpoints), and taking max when updating the end (which handles swallowed intervals). Master this pattern and you have a mental template that unlocks a whole family of interval, sweep-line, and scheduling problems that appear repeatedly in FAANG interviews and real-world system design.
Advertisement