Non-overlapping Intervals
Problem Statement
Given an array of intervals intervals where intervals[i] = [starti, endi], return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.
Note that intervals that only touch at a point are not considered overlapping. For example, [1, 2] and [2, 3] are non-overlapping.
Additional information
1 <= intervals.length <= 10^5
intervals[i].length == 2
-5 * 10^4 <= starti < endi <= 5 * 10^4
Example 1:
Input: intervals = [[1,2],[2,3],[3,4],[1,3]]
Output: 1
Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.
Example 2:
Input: intervals = [[1,2],[1,2],[1,2]]
Output: 2
Explanation: You need to remove two [1,2] to make the rest of the intervals non-overlapping.
Example 3:
Input: intervals = [[1,2],[2,3]]
Output: 0
Explanation: You don't need to remove any of the intervals since they're already non-overlapping.
Dynamic Programming (Longest Increasing Subsequence Variation)
Intuition
Instead of thinking about which intervals to remove, we can flip the problem: what is the maximum number of intervals we can keep without any overlaps? If we subtract this maximum number from the total number of intervals, we get our minimum removals!
This is very similar to the "Longest Increasing Subsequence" (LIS) problem. We can sort the intervals by their start time. Then, we create a dp array where dp[i] stores the maximum number of non-overlapping intervals ending with intervals[i]. For every interval, we look back at all previous intervals. If a previous interval ends before or exactly when our current interval starts, they don't overlap, and we can safely append our current interval to that sequence.
Algorithm
- If the input
intervals array is empty, return 0.
- Sort
intervals based on the start time (index 0).
- Store the total number of intervals in
n.
- Initialize a
dp array of size n filled with 1s (each interval is a valid sequence of length 1 by itself).
- Iterate
i from 1 to n - 1.
- For each
i, iterate j from 0 to i - 1.
- If the end time of interval
j is less than or equal to the start time of interval i (intervals[j][1] <= intervals[i][0]), we can chain them together! Update dp[i] = max(dp[i], dp[j] + 1).
- Find the maximum value in the
dp array, which represents the longest chain of non-overlapping intervals.
- Return
n - max(dp).
def erase_overlap_intervals(intervals: list[list[int]]) -> int:
if not intervals:
return 0
intervals.sort(key=lambda x: x[0])
n = len(intervals)
dp = [1] * n
for i in range(1, n):
for j in range(i):
# If the intervals don't overlap
if intervals[j][1] <= intervals[i][0]:
dp[i] = max(dp[i], dp[j] + 1)
return n - max(dp)
Time & Space Complexity
Time complexity: O(N^2)
Reason: We use a nested loop where for every interval N, we look back at all previous intervals. This results in a quadratic runtime, which will likely trigger a Time Limit Exceeded (TLE) error for large inputs like 10^5.
Space complexity: O(N)
Reason: The dp array requires memory strictly proportional to the number of intervals.
Greedy Algorithm - Optimal
Intuition
We can solve this efficiently in a single pass using a Greedy approach. The core realization is that if we want to leave the maximum amount of free space for future intervals, we should always pick the interval that ends the earliest!
If we sort all intervals by their end time, we can safely just pick the first interval. Because it ends earlier than any other option, it gives the remaining intervals the highest possible chance of fitting into the schedule.
As we iterate through the sorted list, we simply track the prev_end time. If the next interval starts before prev_end, it overlaps, and we must "remove" it. If it starts at or after prev_end, we keep it and update our prev_end pointer.
Algorithm
- Sort the
intervals array based solely on the end time (index 1).
- Initialize
removals = 0 to track how many overlapping intervals we skip.
- Initialize
prev_end = float("-inf") to represent the end time of the last successfully kept interval.
- Iterate through
start, end in the sorted intervals:
- If
start >= prev_end: The interval does not overlap! We keep it by updating prev_end = end.
- If
start < prev_end: The interval overlaps with our currently kept schedule. Since we sorted by end time, the current interval ends later than our kept interval, so it's a worse choice. We remove it by incrementing removals += 1.
- Return
removals.
def erase_overlap_intervals(intervals: list[list[int]]) -> int:
# Sort greedily by end time
intervals.sort(key=lambda x: x[1])
removals = 0
prev_end = float("-inf")
for start, end in intervals:
if start >= prev_end:
# No overlap, keep the interval
prev_end = end
else:
# Overlap detected, remove the interval
removals += 1
return removals
Time & Space Complexity
Time complexity: O(N log N)
Reason: Sorting the intervals array takes O(N log N) time. The single iteration through the array takes O(N) time. The sorting step strictly dominates the time complexity.
Space complexity: O(N)
Reason: Python's built-in Timsort algorithm takes O(N) auxiliary space in the worst-case scenario.