2564. Substring XOR Queries
Problem Description
In this problem, we are given a binary string s
and an array of queries, where each query is represented by a pair of integers [first_i, second_i]
. The goal is to find the shortest substring within s
that, when converted to a decimal value val
and XORed with first_i
, yields second_i
. In simpler terms, for each query, we want to find a substring of s
such that val ^ first_i == second_i
.
The result of each query should be an array [left_i, right_i]
indicating the 0-indexed start and end positions of the substring within s
. If there is more than one possible substring, we select the one with the smallest left_i
. If no such substring exists, we return [-1, -1]
for that query.
A substring, in this context, is defined as a sequential chunk of characters from the string s
without altering the order or skipping any characters in between.
Intuition
The intuition behind the solution is to use a dictionary to keep track of all possible substrings and their decimal values. We iterate over the string s
, and at each position, we try to build substrings of different lengths (up to 32-bits, considering the problem constraints) while keeping track of their decimal values.
We use the following steps:
- Create an empty dictionary
d
to store the substrings. - Iterate over the binary string
s
(using indexi
for the start of the substring). - Within each iteration of
i
, we construct substrings and their decimal values by appending one bit at a time (using indexj
), shifting the previously calculated value to the left, and adding the new bit. - If the decimal value
x
of the current substring has not been seen before, we store it in the dictionary with its starting and ending indexes[i, i + j]
. - If the value is zero, we can break the inner loop early since zero XORed with any number gives that number, and we can't find any smaller substring with a value of zero.
- Now that we have precomputed the dictionary with substrings and their decimal representations, we can answer each query by looking for the decimal value that, when XORed with
first_i
, givessecond_i
. We use XOR in the query because of the property that ifa ^ b = c
, thena = b ^ c
.
For every query [first, second]
, we perform the XOR operation first ^ second
and look up the result in the dictionary. If there's a match, we return the corresponding substring's start and end indexes; otherwise, we return [-1, -1]
for that query.
The benefit of this approach is that it avoids recalculating decimal values of the same substrings for different queries, leveraging the properties of XOR and the lookup efficiency of a dictionary.
Solution Approach
The solution approach utilizes a lookup table (Python dictionary) to store the decimal values of all possible substrings and their corresponding starting and ending positions. Here's the step-by-step breakdown of the implemented algorithm:
-
Initialization: A Python dictionary
d
is created to serve as the lookup table. -
Outer Loop - Iterating over String
s
:- An outer for-loop is set up with variable
i
which iterates from 0 to the length of strings
minus 1. Eachi
represents the starting position of the potential substrings.
- An outer for-loop is set up with variable
-
Inner Loop - Building the Substrings:
-
For each position
i
, an inner for-loop with variablej
is used to try to build substrings of increasing size, sequentially appended by one character at a time (up to a maximum of 31 iterations for a 32-bit limit). -
During construction, a temporary variable
x
is used to calculate the decimal representation of the substring using bit manipulation. At each iteration,x
is left-shifted by one bit (multiplied by 2) and the next bit ofs
is taken into account by using a bitwise OR (denoted by|
) with the result. This is equivalent to appending the next character ofs
and converting the new binary string to a decimal value.x = x << 1 | int(s[i + j])
-
-
Updating the Lookup Table:
- If the current decimal value
x
does not already exist in the dictionaryd
, it is added, with its value being a list[i, i + j]
, representing the start and end indexes of the current substring. - If
x
is 0, we can break from the loop early because no smaller substring than the already found will have the same value, due to the nature of XOR.
- If the current decimal value
-
Handling Queries:
-
After populating the lookup table, we can efficiently answer each query using list comprehension.
-
For each query
[first, second]
in thequeries
list, perform the XOR operationfirst ^ second
and look it up in the dictionaryd
:d.get(first ^ second, [-1, -1])
-
If the result of the XOR operation is found in
d
, the starting and ending positions are returned. If the XOR result is not found,[-1, -1]
is returned, indicating no such substring exists.
-
By converting substrings to their decimal values once and storing them in the dictionary, the algorithm avoids redundant calculations for different queries. Bit manipulation and a lookup table are central to the algorithm's efficiency.
This algorithm greatly optimizes the process of finding the required substrings, especially when dealing with multiple queries, as it leverages the constant time complexity (average case) of dictionary lookups in Python.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach:
Assume our binary string s
is "1011"
and we have a single query queries = [[2, 3]]
. We want to find the shortest substring that, when converted to decimal and XORed with 2
, results in 3
.
Following the solution approach:
-
Initialization: We start by initializing an empty Python dictionary
d
. -
Outer Loop - Iterating over String
s
:- We iterate over string
s
. In our example, our string is "1011", hencei
will run from 0 to 3.
- We iterate over string
-
Inner Loop - Building the Substrings:
- For
i = 0
, we try to build substrings starting ats[0]
.- For
j = 0
, our substring is "1" (in binary), which equals1
in decimal, so we store{1: [0, 0]}
in our dictionary. - For
j = 1
, our substring is "10", which equals2
in decimal, so{2: [0, 1]}
is added to the dictionary. - Continuing, we get "101" (
5
in decimal) and "1011" (11
in decimal), resulting in{5: [0, 2]}
and{11: [0, 3]}
added to the dictionary.
- For
- We repeat this for
i = 1, 2, 3
, each time constructing new substrings and converting them to decimal numbers and adding to the dictionary if they are not yet present.- For
i = 1
(andj = 0
), the substring is "0", which as per step 4 of the implementation does not need to be stored since 0 XORed with any number will not help us find a unique answer we do not already have. - For
i = 2
(andj = 0
), the substring is "11" (3
in decimal), so{3: [2, 2]}
is added to the dictionary. i = 3
only gives us the substring "1" again, which is already in the dictionary, so we do not add it.
- For
The final dictionary
d
after processing will look like this:d = { 1: [0, 0], 2: [0, 1], 5: [0, 2], 11: [0, 3], 3: [2, 2] }
- For
-
Handling Queries:
- Now, we answer our query
[2, 3]
. We compute2 ^ 3
which equals1
. We look up the value1
in our dictionary. - We find it as
d[1]
, which is[0, 0]
.
- Now, we answer our query
Therefore, for the query [2, 3]
, the answer is [0, 0]
. The shortest substring from s
starting at index 0 and ending at index 0 when converted to decimal and XORed with 2
results in 3
.
This illustrates how the algorithm prepares a lookup table for efficient query handling based on substring decimal values. It leverages the dictionary data structure to preemptively store all possible substrings and their indexes, significantly speeding up the processing of each query.
Solution Implementation
1class Solution:
2 def substring_xor_queries(self, s: str, queries: List[List[int]]) -> List[List[int]]:
3
4 # Dictionary to store XOR values as keys and their substring indices as values.
5 xor_to_indices = {}
6 n = len(s)
7
8 # Iterate through each character in the string.
9 for i in range(n):
10 # Initialize XOR value.
11 x = 0
12 # Iterate through bit positions up to 32 bits.
13 for j in range(32):
14 # Break if the substring index exceeds string length.
15 if i + j >= n:
16 break
17 # Shift XOR value to left and add the current bit.
18 x = x << 1 | int(s[i + j])
19 # Store the substring indices in the dictionary if XOR value is not present.
20 if x not in xor_to_indices:
21 xor_to_indices[x] = [i, i + j]
22 # Break if XOR value becomes 0 to avoid unnecessary calculations.
23 if x == 0:
24 break
25
26 # Process queries to find substring indices for XOR values obtained from the queries.
27 return [xor_to_indices.get(first ^ second, [-1, -1]) for first, second in queries]
28
1import java.util.HashMap;
2import java.util.Map;
3
4class Solution {
5 public int[][] substringXorQueries(String s, int[][] queries) {
6 // Create a map to store the XOR value with its corresponding substring indices.
7 Map<Integer, int[]> xorToIndexMap = new HashMap<>();
8 int stringLength = s.length();
9
10 // Iterate through each possible starting index for substrings.
11 for (int startIndex = 0; startIndex < stringLength; ++startIndex) {
12 int xorValue = 0;
13
14 // Build the XOR value for the substring starting at 'startIndex'
15 // and ending before reaching the 32-bit limit or the end of string.
16 for (int offset = 0; offset < 32 && startIndex + offset < stringLength; ++offset) {
17 xorValue = xorValue << 1 | (s.charAt(startIndex + offset) - '0');
18 // Store only the first occurrence of each XOR value.
19 xorToIndexMap.putIfAbsent(xorValue, new int[] {startIndex, startIndex + offset});
20
21 // If the XOR value is 0, no need to continue further for this substring.
22 if (xorValue == 0) {
23 break;
24 }
25 }
26 }
27
28 // Number of queries to process.
29 int numberOfQueries = queries.length;
30 // Array to store the results of the queries.
31 int[][] results = new int[numberOfQueries][2];
32
33 // Process each query.
34 for (int i = 0; i < numberOfQueries; ++i) {
35 int firstQueryValue = queries[i][0];
36 int secondQueryValue = queries[i][1];
37 // Calculate the XOR value for the given query.
38 int queryXor = firstQueryValue ^ secondQueryValue;
39 // Fetch the substring indices corresponding to the XOR value or default to [-1, -1] if not found.
40 results[i] = xorToIndexMap.getOrDefault(queryXor, new int[] {-1, -1});
41 }
42
43 return results;
44 }
45}
46
1#include <vector>
2#include <string>
3#include <unordered_map>
4using namespace std;
5
6class Solution {
7public:
8 // Function to process the XOR queries on substrings
9 vector<vector<int>> substringXorQueries(string s, vector<vector<int>>& queries) {
10 // Dictionary to store the XOR values and their corresponding start and end indices
11 unordered_map<int, vector<int>> xorDictionary;
12 // Length of the input string
13 int length = s.size();
14
15 // Iterating through each character of the string
16 for (int i = 0; i < length; ++i) {
17 int xorValue = 0;
18
19 // Generate XOR for substrings starting at i and having a length of up to 32 bits
20 for (int j = 0; j < 32 && i + j < length; ++j) {
21 // Shift the XOR value left by 1 bit and add the current bit value
22 xorValue = (xorValue << 1) | (s[i + j] - '0');
23
24 // If the XOR value has not been seen before, store it with its indices
25 if (!xorDictionary.count(xorValue)) {
26 xorDictionary[xorValue] = {i, i + j};
27 }
28
29 // If the XOR value is 0, break, as all future XORs will remain 0
30 if (xorValue == 0) {
31 break;
32 }
33 }
34 }
35
36 // Vector to store the results of the queries
37 vector<vector<int>> results;
38
39 // Process each query
40 for (auto& query : queries) {
41 int startIndex = query[0], endIndex = query[1];
42 int xorQueryValue = startIndex ^ endIndex; // Compute the XOR of the given indices
43
44 // Check if the XOR value exists in our dictionary
45 if (xorDictionary.count(xorQueryValue)) {
46 // If so, add the corresponding indices to the results
47 results.emplace_back(xorDictionary[xorQueryValue]);
48 } else {
49 // Otherwise, add {-1, -1} to indicate the value isn't found
50 results.push_back({-1, -1});
51 }
52 }
53
54 // Return the results of the queries
55 return results;
56 }
57};
58
1// Importing necessary functionality
2import { string } from "prop-types";
3
4// Method to process the XOR queries on substrings
5function substringXorQueries(s: string, queries: number[][]): number[][] {
6 // Dictionary to store the XOR values and their corresponding start and end indices
7 const xorDictionary: { [key: number]: number[] } = {};
8 // Length of the input string
9 const length: number = s.length;
10
11 // Iterating through each character of the string
12 for (let i = 0; i < length; ++i) {
13 let xorValue = 0;
14
15 // Generate XOR for substrings starting at i and having a length of up to 32 bits
16 for (let j = 0; j < 32 && i + j < length; ++j) {
17 // Shift the XOR value left by 1 bit and add the current bit value
18 xorValue = (xorValue << 1) | (s.charCodeAt(i + j) - '0'.charCodeAt(0));
19
20 // If the XOR value has not been seen before, store it with its indices
21 if (xorDictionary[xorValue] === undefined) {
22 xorDictionary[xorValue] = [i, i + j];
23 }
24
25 // If the XOR value is 0, break, as all future XORs will remain 0
26 if (xorValue === 0) {
27 break;
28 }
29 }
30 }
31
32 // Array to store the results of the queries
33 const results: number[][] = [];
34
35 // Process each query
36 for (const query of queries) {
37 const startIndex = query[0], endIndex = query[1];
38 const xorQueryValue = startIndex ^ endIndex; // Compute the XOR of the given indices
39
40 // Check if the XOR value exists in our dictionary
41 if (xorDictionary[xorQueryValue] !== undefined) {
42 // If so, add the corresponding indices to the results
43 results.push(xorDictionary[xorQueryValue]);
44 } else {
45 // Otherwise, add [-1, -1] to indicate the value isn't found
46 results.push([-1, -1]);
47 }
48 }
49
50 // Return the results of the queries
51 return results;
52}
53
54// Example of how to use the function
55const s = "11010";
56const queries = [[0, 1], [1, 2]];
57const results = substringXorQueries(s, queries);
58console.log(results); // Expected output based on the function's logic
59
Time and Space Complexity
Time Complexity
The time complexity of the given code can be broken down as follows:
-
The outer loop runs
n
times wheren
is the length of the strings
. -
The inner loop, which is responsible for constructing the different XOR substrings and storing them into the dictionary
d
, runs for at most32
iterations (since we're looking at a maximum of a 32-bit integer representation). -
Within the inner loop, there is a constant-time operation of left shifting (
x << 1
) and OR-ing (| int(s[i + j])
), as well as dictionary insertion or check which can be considered O(1) on average. -
After the loops, the code iterates through each query to look up the XOR result in the dictionary, which can be done in O(1) time on average for each query.
Assuming m
queries, the total time complexity is:
O(n * 32 + m) = O(n + m)
This assumes dictionary operations take constant time.
However, we must bear in mind that if there were a lot of collisions and the Python dictionary had to handle these, the worst-case time complexity for insertions and lookups could degrade to O(n). But this is highly unlikely with a good hash function and is not considered in average case analysis.
Space Complexity
-
The space complexity mainly comes from the dictionary
d
where all unique XOR results of substrings starting at different positions are stored. -
In the worst case, every possible starting index and length could yield a unique XOR value, which means the size of
d
could grow ton * 32
. -
The space complexity also includes the output list of lists, but this only adds space linearly proportional to the number of queries
m
.
Thus, the space complexity is O(n * 32 + m) = O(n + m)
, with the same assumption about average case behavior of dictionary storage.
Learn more about how to find time and space complexity quickly using problem constraints.
Which data structure is used in a depth first search?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!