645. Set Mismatch

EasyBit ManipulationArrayHash TableSorting

Problem Description

In this problem, we're dealing with a particular set of integers ranging from 1 to n where due to an error, one of the numbers was duplicated, causing another number to be missing. Our task is to figure out which number was duplicated (appears twice) and which one is missing from the array nums given to us.

The key to solving this problem lies in taking advantage of the properties of exclusive or (XOR) operations and the specifics of the data corruption that occurred.

Intuition

The intuition behind the solution can be developed by considering the properties of XOR operation. We know that for any number x, doing x XOR x gives us 0, and x XOR 0 gives us x. Thus, XOR-ing a number with itself cancels it out.

With this property in mind, we can XOR all elements in the nums array with each integer from 1 to n. Ideally, if no number was duplicated and none missing, the result should be 0 because every number would cancel itself out. However, since we have one duplicate and one missing number, we'll be left with the XOR of these two numbers.

From there, we can distinguish between the duplicate and missing numbers by noticing that they must differ in at least one bit. We find this differing bit (the rightmost bit is usually a good choice), and use it as a mask to separate the numbers into two groups - one with this bit set and one without it.

Now, we apply XOR operation within each group, including the numbers from 1 to n. This separate XORing would give us the duplicate number and the missing number, but we don't know which is which yet. We verify by checking each number in nums again. The number that we come across in the array is the duplicate, and the other one is the missing number.

By applying XOR in such strategic ways, we leverage its ability to cancel equal pairs and isolate the odd ones out: in this case, our duplicate and missing numbers.

Learn more about Sorting patterns.

Solution Approach

There are multiple approaches to solving this problem, each using different algorithms, data structures, or patterns. Here, we'll discuss three distinct approaches.

Solution 1: Mathematics

Using the principle of arithmetical sums, we can find the difference between the sum of a complete set from 1 to n (s_1) and the sum of our corrupted array (s_2 after duplicates removal). The sum difference s - s_2 gives us the repeated element, while s_1 - s_2 identifies the missing element. This approach is direct and relies on simple arithmetic operations. However, it requires handling potentially large sums, which could lead to integer overflow issues if not careful.

Solution 2: Hash Table

This approach is more straightforward: construct a hash table to count the number of times each element appears in nums. Then, simply iterate through the range 1 to n and check the count for each number. If the count is 2, the number is duplicated; if 0, it's the missing number. The hash table provides a direct mapping from elements to their counts, making lookups efficient (O(1) on average).

Solution 3: Bit Operation (implemented in the provided code)

The bit manipulation solution is more complex but efficient, especially regarding space complexity. The idea revolves around XOR operations. We iterate through the array nums and range from 1 to n, XOR-ing each element and index. This process will give us xs, the XOR of the duplicate number a and the missing number b (xs = a XOR b).

To separate these two values, we find the rightmost bit that is set in xs (lb = xs & -xs). We use this bit to divide the numbers into two groups and perform XOR operations on those groups. After separating elements into two piles, we XOR all elements that have this bit set and all elements that don't. This will result in two numbers, one being a and the other b. To identify which is which, we iterate through the nums array again. The one we find in the array is our duplicate number.

Finally, we return the duplicate and missing numbers as a list [a, b], ensuring the order is correct based on the final check with the original array.

The brilliance of this approach lies in using the distinct patterns of XOR to segregate and identify unique elements while maintaining a very low space complexity.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's illustrate the third solution approach with XOR operation using a small example.

Assume our array nums is [1, 2, 2, 4] for n = 4. The ideal array without any errors would be [1, 2, 3, 4].

We start by initializing xs to 0 and we XOR all elements in nums and all numbers from 1 to n. Doing so gives us:

xs = 1^2^2^4^1^2^3^4

After canceling out duplicate pairs (1^1, 2^2, and 4^4) we are left with:

xs = 2^3

xs is now the XOR of the duplicate number and the missing number.
Next, we find the rightmost set bit in xs, which we'll use as a bitmask to differentiate between the two numbers that didn't cancel out.

xs = 2^3 = 0010^0011 = 0001

The rightmost set bit is 1 (we can find it by performing xs & -xs).
We now use this bit as a mask to split the elements into two groups. The first group will have this bit set (3 for our case) and the second will not (2 for our case).

We perform XOR operations separately on these two groups alongside the numbers from 1 to n:
- With bit set: 3
- Without bit set: 1^2^2^4^1^4
After cancellation:
- With bit set: 3 (already isolated, no duplicates to cancel it out)
- Without bit set: 2 (since 1^1 and 4^4 cancel themselves out leaving 2^2 which, after XORing, gives 2)
Now, we have two results from each group: 3 and 2. We don't yet know which is the duplicate number and which is the missing one.
To find out, we have to compare it against the nums array. In our example, we find 2 in nums, meaning 2 is the duplicate number.
Since 3 did not appear in nums, it is the missing number.
We conclude that the duplicate number (a) is 2, and the missing number (b) is 3. Our ordered result would be [2, 3].

In this example walkthrough, we have effectively demonstrated how the third solution approach, which focuses on bit manipulation, can identify the duplicate and the missing numbers in an array with a corrupted sequence of integers.

Solution Implementation

1class Solution:
2    def findErrorNums(self, nums: List[int]) -> List[int]:
3        xor_result = 0
4        # First pass: calculate the XOR of all indices and the elements in nums
5        for index, value in enumerate(nums, 1):
6            xor_result ^= index ^ value
7
8        # Obtain the rightmost set bit (least significant bit)
9        # This will serve as a mask for segregating numbers
10        rightmost_set_bit = xor_result & -xor_result
11
12        # We will use `number_a` to find one of the numbers by segregating into two buckets
13        # The rightmost set bit will allow us to xor numbers into two groups
14        # One group where the bit is set and another where it's not
15        number_a = 0
16        for index, value in enumerate(nums, 1):
17            if index & rightmost_set_bit:
18                number_a ^= index
19            if value & rightmost_set_bit:
20                number_a ^= value
21      
22        # `number_b` is the other number we need to find
23        number_b = xor_result ^ number_a
24
25        # Check if `number_a` is the duplicated number or the missing number
26        # If `number_a` is found in nums, it is the duplicated one
27        for value in nums:
28            if value == number_a:
29                return [number_a, number_b]
30      
31        # If `number_a` is not in nums, it's the missing number and `number_b` is the duplicate
32        return [number_b, number_a]
33

1class Solution {
2    public int[] findErrorNums(int[] nums) {
3        // The length of the array
4        int length = nums.length;
5        // Variable to store the XOR of all indices and values
6        int xorResult = 0;
7
8        // XOR all the indices with their corresponding values
9        for (int i = 1; i <= length; ++i) {
10            xorResult ^= i ^ nums[i - 1];
11        }
12
13        // Find the rightmost set bit of xorResult
14        int rightmostSetBit = xorResult & -xorResult;
15
16        // Initialize variables to 0 that will store the two unique numbers
17        int number1 = 0;
18
19        // Separate the numbers into two groups and XOR them individually
20        for (int i = 1; i <= length; ++i) {
21            // Check for the rightmost set bit and XOR
22            if ((i & rightmostSetBit) > 0) {
23                number1 ^= i;
24            }
25            if ((nums[i - 1] & rightmostSetBit) > 0) {
26                number1 ^= nums[i - 1];
27            }
28        }
29      
30        // XORing with xorResult to find the other number
31        int number2 = xorResult ^ number1;
32      
33        // Iterate over the array to check which one is the duplicate number
34        for (int i = 0; i < length; ++i) {
35            if (nums[i] == number1) {
36                // If number1 is the duplicate, return number1 and number2 in order
37                return new int[] {number1, number2};
38            }
39        }
40      
41        // If number1 is not the duplicate then number2 is, so return number2 and number1 in order
42        return new int[] {number2, number1};
43    }
44}
45

1class Solution {
2public:
3    vector<int> findErrorNums(vector<int>& nums) {
4        int size = nums.size(); // Get the size of the input vector
5        int xorSum = 0; // Initialize variable for XOR sum
6
7        // Calculate XOR for all numbers from 1 to n and all elements in nums
8        for (int i = 1; i <= size; ++i) {
9            xorSum ^= i ^ nums[i - 1];
10        }
11
12        // Determine the rightmost set bit (which differs between the missing and duplicate)
13        int rightMostBit = xorSum & -xorSum;
14        int oneBitSetNum = 0; // This will hold one of the two numbers (missing or duplicate)
15
16        // Divide the numbers into two groups based on the rightMostBit and XOR within groups
17        for (int i = 1; i <= size; ++i) {
18            if (i & rightMostBit) { // If the bit is set
19                oneBitSetNum ^= i; // XOR for the range 1 to N
20            }
21            if (nums[i - 1] & rightMostBit) { // If the bit is set in the current element
22                oneBitSetNum ^= nums[i - 1]; // XOR with the nums elements
23            }
24        }
25
26        int otherNum = xorSum ^ oneBitSetNum; // Find the second number using the xorSum
27
28        // Determine which number is the missing number and which one is the duplicate
29        for (int i = 0; i < size; ++i) {
30            if (nums[i] == oneBitSetNum) {
31                // If we find oneBitSetNum in nums array, it is the duplicate
32                return {oneBitSetNum, otherNum}; // Return the duplicate and missing numbers
33            }
34        }
35
36        // If not found, then oneBitSetNum is the missing number and otherNum is the duplicate
37        return {otherNum, oneBitSetNum};
38    }
39};
40

1function findErrorNums(nums: number[]): number[] {
2    // Calculate the total number of elements in the array.
3    const length = nums.length;
4  
5    // Initialize xorSum to 0, which will be used to find the XOR sum of all elements and their indices.
6    let xorSum = 0;
7  
8    // Loop through the array and XOR both the index and value of each element.
9    for (let index = 1; index <= length; ++index) {
10        xorSum ^= index ^ nums[index - 1];
11    }
12  
13    // Determine the rightmost set bit in xorSum using two's complement.
14    const rightmostSetBit = xorSum & -xorSum;
15  
16    // Initialize a variable to hold one of the two numbers we're looking for.
17    let oneNumber = 0;
18  
19    // Loop through the array and indices to partition them based on the rightmostSetBit.
20    for (let index = 1; index <= length; ++index) {
21        if (index & rightmostSetBit) {
22            oneNumber ^= index;
23        }
24        if (nums[index - 1] & rightmostSetBit) {
25            oneNumber ^= nums[index - 1];
26        }
27    }
28  
29    // Determine the other number by XORing oneNumber with xorSum.
30    const otherNumber = xorSum ^ oneNumber;
31  
32    // Check which number is missing from the array and which number is duplicated.
33    // The number that appears in the nums array is the duplicated one,
34    // and the one that's missing is the number that should have been there.
35    return nums.includes(oneNumber) ? [oneNumber, otherNumber] : [otherNumber, oneNumber];
36}
37

Time and Space Complexity

The time complexity of the given code is O(n) where n is the length of the array nums. This is because each of the loops in the function (for i, x in enumerate(nums, 1): and for x in nums:) iterates over the list nums exactly once, and all operations within the loops are constant time operations.

The space complexity of the function is O(1), which means it uses a constant amount of extra space. Despite the input list size, the space consumed by the variables xs, lb, a, and b does not scale with n, thus having no additional space depending on the input size.

Learn more about how to find time and space complexity quickly using problem constraints.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Question 1 out of 10

What are the two properties the problem needs to have for dynamic programming to be applicable? (Select 2)

Optimal substructure

Overlapping subproblems

Non-overlapping subproblems

Constant time subproblems