Find the Duplicate Number
Problem Statement
Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.
There is only one repeated number in nums, return this repeated number.
You must solve the problem without modifying the array nums and uses only constant extra space.
Additional information
1 <= n <= 10^5
nums.length == n + 1
1 <= nums[i] <= n
- All the integers in
nums appear only once except for precisely one integer which appears two or more times.
Example 1:
Input: nums = [1, 3, 4, 2, 2]
Output: 2
Example 2:
Input: nums = [3, 1, 3, 4, 2]
Output: 3
Example 3:
Input: nums = [3, 3, 3, 3, 3]
Output: 3
Sorting (Sub-optimal)
Intuition
If we sort the array, any duplicate numbers will be adjacent to each other. We can simply iterate through the sorted array and check if nums[i] == nums[i+1].
Note: This approach modifies the array (sorting), which technically violates the problem constraints, but it is an important starting point.
Algorithm
- Sort the array
nums in ascending order.
- Iterate
i from 0 to n - 2.
- If
nums[i] == nums[i+1], return nums[i].
- Return
-1 (should not be reached).
def find_duplicate(nums: list[int]) -> int:
nums.sort()
for i in range(len(nums) - 1):
if nums[i] == nums[i+1]:
return nums[i]
return -1
Time & Space Complexity
Time complexity: O(n log n)
Reason: Sorting takes O(n log n).
Space complexity: O(1) or O(n)
Reason: Depending on the sorting implementation (e.g., Timsort uses O(n)).
Floyd's Cycle Detection (Optimal)
Intuition
Since the values are in the range [1, n], every value in the array can be treated as a pointer to an index.
Example: nums[0] = 3 implies a pointer from index 0 to index 3.
Because there is a duplicate number, at least two indices point to the same "next" index. This structure creates a cycle (loop) in the linked list representation. The duplicate number is the entrance to that cycle.
We can use Floyd's Tortoise and Hare algorithm:
- Phase 1 (Intersection): Move
slow pointer by 1 step and fast pointer by 2 steps until they collide. The collision point is somewhere inside the cycle.
- Phase 2 (Entrance): Reset
slow to the start (0). Move both pointers 1 step at a time. The point where they meet is the duplicate number (cycle entrance).
Algorithm
- Initialize
slow = 0 and fast = 0.
- Phase 1:
slow = nums[slow]
fast = nums[nums[fast]]
- Repeat until
slow == fast.
- Phase 2:
Reset slow2 = 0.
While slow != slow2:
slow = nums[slow]
slow2 = nums[slow2]
Return slow (or slow2).
def find_duplicate(nums: list[int]) -> int:
slow, fast = 0, 0
# Phase 1: Find intersection
while True:
slow = nums[slow]
fast = nums[nums[fast]]
if slow == fast:
break
# Phase 2: Find entrance
slow2 = 0
while True:
slow = nums[slow]
slow2 = nums[slow2]
if slow == slow2:
return slow
Time & Space Complexity
Time complexity: O(n)
Reason: In Phase 1, the pointers meet inside the cycle (linear steps). In Phase 2, they travel at most n steps to find the entrance.
Space complexity: O(1)
Reason: We only use two or three pointers. We do not modify the array.