1794. Count Pairs of Equal Substrings With Minimum Difference
Problem Description
In this problem, we are given two strings, firstString
and secondString
, and our task is to count the number of unique quadruples of indices (i,j,a,b)
that meet certain conditions. Each quadruple represents a matching substring between firstString
and secondString
with the indices fulfilling the following:
0 <= i <= j < firstString.length
, meaning thati
andj
are valid indices within thefirstString
, andi
can be equal toj
to allow for single-character substrings.0 <= a <= b < secondString.length
, indicating thata
andb
are valid indices within thesecondString
, with the same provision fora
to equalb
.- The substring from
i
toj
(inclusive) infirstString
is exactly the same as the substring froma
tob
insecondString
. j - a
should be the smallest possible difference for all such valid quadruples. This implies that we're interested in matching substrings that are closest to the start offirstString
.
The goal is to return the count of such quadruples.
Intuition
To find a solution, we need to identify all possible matching substrings between firstString
and secondString
, but with an added twist of minimizing the j - a
difference. This suggests that a brute force method of checking every possible substring between the two strings would be inefficient, particularly with large strings. Instead, we should find a way to both confirm when substrings match and also ensure we're doing so in a way that minimizes j - a
.
A key observation is that for any matching substring, the last character of the substring in secondString
must be the closest it can possibly be to the start of secondString
compared to its index in firstString
. So, for every character in firstString
, we need to find the last occurrence of that character in secondString
and track the minimum j - a
value.
The intuition behind the solution is to keep track of the closest last occurrence of each character from secondString
as we iterate through firstString
. We maintain a dictionary, last
, that maps each character to its last occurrence in secondString
. As we go through each character c
in firstString
, we calculate the difference between the current index in firstString
and the index of the last occurrence of c
in secondString
. If the current difference is less than the minimum difference found so far (mi
), we update mi
and reset the count of valid quadruples to 1, as this is the new minimum difference. If we encounter the same minimum difference again, we increment our count.
By tracking the minimum difference as we iterate through firstString
and updating when necessary, we are able to tally the number of quadruples efficiently and ensure that the condition j - a
is minimized.
Learn more about Greedy patterns.
Solution Approach
The solution uses a dictionary to store the mapping of characters to their last occurrence in secondString
. This is crucial because the conditions specify that we want to match substrings and minimize j - a
, which inherently requires us to know the last (closest to the end) index of a character.
The steps of the implementation are as follows:
-
We first initialize the dictionary
last
to map each characterc
insecondString
to the indexi
of its last occurrence, as given by{c: i for i, c in enumerate(secondString)}
. This operation happens once and is an O(n) operation where n is the length ofsecondString
. -
We set
ans
to 0, which will keep the count of the number of valid quadruples, andmi
toinf
(infinity), which is a placeholder for the minimum value ofj - a
we have seen so far. -
We then iterate through each character
c
and its indexi
infirstString
, checking ifc
is present inlast
. If it is not, then there's no matching substring ending with that character, and we move on. -
When we do find a matching character, we calculate the difference
t = i - last[c]
, which representsj - a
. -
We then compare
t
withmi
. Ift
is smaller, it means we have found a closer match in terms of the indices, and we updatemi
tot
and resetans
to 1 because we've found a new minimum. Ift
is equal tomi
, it means we've found another quadruple with the same minimum difference, so we incrementans
by 1. -
Finally, we return the value of
ans
, which is the total count of such quadruples after iterating through the entirefirstString
.
This approach is efficient as it only requires O(n + m) time complexity, where n is the length of firstString
and m is the length of secondString
, because we go through each string only once. It avoids the much less efficient O(nmmin(n,m)) time complexity that would be required if we were to naively look for matching substrings with a nested loop and substring comparisons.
A key algorithmic principle at play here is the use of a "greedy" strategy that aims to always choose the closest last occurrence of a character when considering matching substrings, ensuring the smallest possible j - a
as required by the problem.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example:
Suppose we have firstString = "abac"
and secondString = "cabba"
. We want to find the number of unique quadruples where the substrings match and have the smallest j - a
difference.
-
We start by creating the
last
dictionary mapping each character insecondString
to its last occurrence:{'c': 0, 'a': 4, 'b': 3}
. -
We set
ans
to 0 andmi
(minimum difference) to infinity. -
Iterate through each character in
firstString
:- For
i = 0
withc = 'a'
,last['a']
is 4, sot = 0 - 4 = -4
. Sincet
is less thanmi
, we setmi
to -4 andans
to 1. - For
i = 1
withc = 'b'
, there is no 'b' inlast
, so we move on. - For
i = 2
withc = 'a'
,last['a']
is still 4, sot = 2 - 4 = -2
.t
is greater thanmi
, so we ignore this. - For
i = 3
withc = 'c'
,last['c']
is 0, sot = 3 - 0 = 3
.t
is greater thanmi
, so again, we ignore this.
- For
-
After iteration, we find that
ans
is 1, which indicates there is only one quadruple satisfying the conditions:(0, 0, 4, 4)
, representing the substring 'a' from both strings.
Thus, the function would return 1 as the count of such quadruples.
By following this approach, we efficiently calculated the result without having to compare each possible substring pair, saving computation time and adhering to the problem's constraints.
Solution Implementation
1class Solution:
2 def countQuadruples(self, first_string: str, second_string: str) -> int:
3 # Create a dictionary to map each character in the second string to the index of its last occurrence.
4 last_occurrence_index = {char: index for index, char in enumerate(second_string)}
5
6 # Initialize answer to 0, and minimum difference as infinity (a large number).
7 answer = 0
8 minimum_difference = float('inf') # 'inf' represents infinity in Python.
9
10 # Loop over each character and its index in the first string.
11 for index, char in enumerate(first_string):
12 # Check if the current character exists in the second string.
13 if char in last_occurrence_index:
14 # Calculate the difference between the index of the character in
15 # the first string and the last occurrence index in the second string.
16 difference = index - last_occurrence_index[char]
17
18 # If the current difference is less than the minimum difference recorded:
19 if minimum_difference > difference:
20 # Update minimum difference and reset answer to 1.
21 minimum_difference = difference
22 answer = 1
23 # If the current difference matches the minimum difference:
24 elif minimum_difference == difference:
25 # Increment the answer by 1.
26 answer += 1
27
28 # Return the final count of quadruples.
29 return answer
30
1class Solution {
2 public int countQuadruples(String firstString, String secondString) {
3 // Initialize an array to store the last occurrence index of each character
4 int[] lastOccurrence = new int[26];
5 // Traverse through the second string to fill the last occurrence array
6 for (int i = 0; i < secondString.length(); ++i) {
7 lastOccurrence[secondString.charAt(i) - 'a'] = i + 1;
8 }
9
10 int count = 0; // Counter for the number of quadruples
11 int minDifference = Integer.MAX_VALUE; // Variable to track the minimum difference
12
13 // Traverse through the first string to find valid quadruples
14 for (int i = 0; i < firstString.length(); ++i) {
15 // Get the last occurrence of the current character from the first string in the second string
16 int j = lastOccurrence[firstString.charAt(i) - 'a'];
17 // Ensure that the character also occurs in the second string
18 if (j > 0) {
19 int difference = i - j; // Calculate the difference
20 // If the difference is less than the current minimum, update the minimum and reset the count
21 if (minDifference > difference) {
22 minDifference = difference;
23 count = 1;
24 } else if (minDifference == difference) {
25 // If the difference is the same as the current minimum, increment the count
26 ++count;
27 }
28 }
29 }
30 // Return the count of quadruples
31 return count;
32 }
33}
34
1#include <string>
2#include <vector>
3#include <climits>
4
5class Solution {
6public:
7 // Count the number of quadruples where the last occurrence of a character in the first string
8 // comes before the last occurrence of the same character in the second string
9 int countQuadruples(string firstString, string secondString) {
10 // Initialize an array to store the last occurrence indices for each character in the second string
11 // Note that the array is initialized with zeros (0) which signifies that the character hasn't been found yet.
12 int lastOccurrence[26] = {};
13
14 // Find the last occurrence of each character in the second string
15 for (int i = 0; i < secondString.size(); ++i) {
16 // Using character 'a' as base index 0 for 'a' to 'z' as 0 to 25
17 lastOccurrence[secondString[i] - 'a'] = i + 1; // Store index + 1 to differentiate between found and not found.
18 }
19
20 int count = 0; // Will hold the final count of quadruples
21 int minDifference = INT_MAX; // Start with the maximum value as the difference to find the minimum
22
23 // Iterate through the first string to find the quadruples
24 for (int i = 0; i < firstString.size(); ++i) {
25 int currentIndex = lastOccurrence[firstString[i] - 'a']; // Get the last occurrence index from the second string
26
27 // If the character is present in the second string
28 if (currentIndex) {
29 int difference = i - (currentIndex - 1); // Calculate the difference i - j
30
31 // If a new minimum difference is found, update 'minDifference' and reset count to 1
32 if (minDifference > difference) {
33 minDifference = difference;
34 count = 1;
35 }
36 // If the same minimum difference is found again, increment count
37 else if (minDifference == difference) {
38 ++count;
39 }
40 }
41 }
42
43 // Return the total count of quadruples found
44 return count;
45 }
46};
47
1// The function takes two strings and counts the number of quadruples where the last occurrence of a
2// character in the first string comes before the last occurrence of the same character in the second string.
3function countQuadruples(firstString: string, secondString: string): number {
4 // An array to keep track of the last occurrence indices for each letter in the second string.
5 // A value of 0 indicates the character has not been found yet.
6 let lastOccurrence: number[] = new Array(26).fill(0);
7
8 // Find the last occurrence index of each character in the second string.
9 for (let i = 0; i < secondString.length; i++) {
10 // Convert the character to an index from 0 to 25 (for 'a' to 'z').
11 let charIndex = secondString.charCodeAt(i) - 'a'.charCodeAt(0);
12 // Store the last occurrence index of the current character + 1 (to differentiate between found and not found).
13 lastOccurrence[charIndex] = i + 1;
14 }
15
16 let count = 0; // Used to keep track of the number of quadruples found.
17 let minDifference = Number.MAX_SAFE_INTEGER; // Initialize with the largest safe integer value as a starting point.
18
19 // Iterate through each character in the first string to find quadruples.
20 for (let i = 0; i < firstString.length; i++) {
21 // Get the index of the last occurrence of the current character from the second string.
22 let charIndex = firstString.charCodeAt(i) - 'a'.charCodeAt(0);
23 let currentIndex = lastOccurrence[charIndex];
24
25 // Check if the current character is present in the second string.
26 if (currentIndex) {
27 // Calculate the difference between indices (i - j).
28 let difference = i - (currentIndex - 1);
29
30 // Update 'minDifference' and reset 'count' if a new minimum difference is found.
31 if (minDifference > difference) {
32 minDifference = difference;
33 count = 1;
34 }
35 // If the same minimum difference is found, increment the 'count'.
36 else if (minDifference === difference) {
37 count++;
38 }
39 }
40 }
41
42 // Return the final count of quadruples found.
43 return count;
44}
45
Time and Space Complexity
The provided Python code defined within the Solution
class is designed to find the number of times a minimal difference between indices of matching characters from firstString
and secondString
occurs. The time complexity and space complexity of this code are as follows:
Time Complexity
The time complexity of this code can be split into two major components:
-
Building the
last
dictionary, where the last occurrence of each character fromsecondString
is stored with the character as the key and its index as the value. In the worst case,enumerate(secondString)
will iterate through all characters ofsecondString
, which takesO(n)
, wheren
refers to the length ofsecondString
. -
Iterating over
firstString
to calculate the differences and counting the minimal differences. This is another loop that goes through all the characters infirstString
, which, in the worst case, results inO(m)
, wherem
refers to the length offirstString
.
Since these operations are sequential, the overall time complexity is the sum of the two individual complexities:
O(n) + O(m)
. Since these two strings are independent, simplifying the expression does not combine the terms, and the final time complexity is O(m + n)
.
Space Complexity
The space complexity is determined by the additional space required by the algorithm which is not part of the input or the output. In this case, it's:
-
The space used by the
last
dictionary which, in the worst case, contains an entry for every unique character insecondString
, and since there is a fixed limit to the character set (in the case of ASCII, a maximum of 128 characters), the space taken bylast
could be consideredO(1)
. -
The two variables
ans
andmi
occupy constant spaceO(1)
.
Therefore, the space complexity is the larger of the space used by the last
dictionary and the space for the variables ans
and mi
, which results in O(1)
space complexity.
Overall, the provided code has a time complexity of O(m + n)
and a space complexity of O(1)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which algorithm should you use to find a node that is close to the root of the tree?
Recommended Readings
Greedy Introduction div class responsive iframe iframe src https www youtube com embed WTslqPbj7I title YouTube video player frameborder 0 allow accelerometer autoplay clipboard write encrypted media gyroscope picture in picture web share allowfullscreen iframe div When do we use greedy Greedy algorithms tend to solve optimization problems Typically they will ask you to calculate the max min of some value Commonly you may see this phrased in the problem as max min longest shortest largest smallest etc These keywords can be identified by just scanning
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!