159. Longest Substring with At Most Two Distinct Characters
Problem Description
The problem provides us with a string s
and asks us to determine the length of the longest substring which contains at most two distinct characters. For example, in the string "aabacbebebe," the longest substring with at most two distinct characters is "cbebebe," which has a length of 7.
The challenge here is to make sure we can efficiently figure out when we have a valid substring (with 2 or fewer distinct characters) and how to effectively measure its length. We are required to iterate through our string and somehow keep track of the characters we have seen, all the while being able to update and retrieve the length of the longest qualifying substring.
Intuition
The intuition behind the solution stems from the two-pointer technique, also known as the sliding window approach. The main idea is to maintain a dynamic window of characters in the string which always satisfies the condition of having at most two distinct characters.
In this solution, we use dictionary cnt
that acts as our counter for each character within our current window and variables j
and ans
which represent the start of the window and the answer (length of longest substring), respectively.
Here's how we can logically step through the solution:
- We iterate through the string with a pointer
i
, which represents the end of our current window. - As we iterate, we update the counter
cnt
for the current character. - If ever our window contains more than two distinct characters, we need to shrink it from the start (pointer
j
). We do this inside thewhile
loop until we again have two or fewer distinct characters. - Inside the
while
loop, we also decrement the count of the character at the start of the window and if its count hits zero, we remove it from our counter dictionary. - After we ensure our window is valid, we update our answer
ans
with the maximum length found so far, which is the difference between our pointersi
andj
plus one. - Finally, we continue this process until the end of the string and return the answer.
This solution is efficient as it only requires a single pass through the string, and the operations within the sliding window are constant time on average, offering a time complexity of O(n), where n is the length of the string.
Learn more about Sliding Window patterns.
Solution Approach
The implementation includes the use of a sliding window strategy and a hash table (Counter) for tracking character frequencies. Let's go step by step through the solution:
-
Initialize the Counter and Variables:
- A
Counter
objectcnt
from thecollections
module is used to monitor the number of occurrences of each character within the current window. - Two integer variables
ans
andj
are initialized to 0.ans
will hold the final result, whilej
will indicate the start of the sliding window.
- A
-
Iterate Through the String:
- The string
s
is looped through with the variablei
acting as the end of the current window andc
as the current character in the loop. - For each character
c
, its count is incremented in theCounter
bycnt[c] += 1
.
- The string
-
Validate the Window:
- A
while
loop is utilized to reduce the window size from the left if the number of distinct characters in theCounter
is more than two. - Inside the loop:
- Decrement the count of the character at the start
s[j]
of the window. - If the count of that character becomes zero, it is removed from the Counter to reflect that it's no longer within the window.
- Increment
j
to effectively shrink the window from the left side.
- Decrement the count of the character at the start
- A
-
Update the Maximum Length Result:
- After the window is guaranteed to contain at most two distinct characters, the maximum length
ans
is updated with the length of the current valid window, calculated asi - j + 1
.
- After the window is guaranteed to contain at most two distinct characters, the maximum length
-
Return the Result:
- Once the end of the string is reached, the loop terminates, and
ans
, which now holds the length of the longest substring containing at most two distinct characters, is returned.
- Once the end of the string is reached, the loop terminates, and
This algorithm effectively maintains a dynamic window of the string, ensuring its validity with respect to the distinct character constraint and updating the longest length found. The data structure used (Counter
) immensely simplifies frequency management and contributes to the efficiency of the solution. The pattern used (sliding window) is particularly well-suited for problems involving contiguous sequences or substrings with certain constraints, as it allows for an O(n) time complexity traversal that accounts for all possible valid substrings.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a smaller example using the string "aabbcca."
-
Initialize the Counter and Variables:
- Initiate
cnt
as an emptyCounter
object and variablesans
andj
to 0.
- Initiate
-
Iterate Through the String:
- Set
i
to 0 (starting at the first character 'a'). - Increment
cnt['a']
by 1. Now,cnt
has {'a': 1}.
- Set
-
Validate the Window:
- No more than two distinct characters are in the window, so we don't shrink it yet.
-
Update the Maximum Length Result:
- Set
ans
toi - j + 1
, which is0 - 0 + 1 = 1
.
- Set
Next, move i
to the next character (another 'a') and repeat the steps:
cnt['a']
becomes 2, the window still has only one distinct character.- Update
ans
toi - j + 1
, which becomes1 - 0 + 1 = 2
.
Continuing this process:
- For
i = 2
andi = 3
, the steps are similar as we continue to read 'b'.cnt
becomes {'a': 2, 'b': 2} andans
now is4
. - At
i = 4
, the first 'c' is encountered,cnt
becomes {'a': 2, 'b': 2, 'c': 1}, and nowans
is also updated to4
.
-
Validate the Window:
- With
i = 5
, we encounter the second 'c' and nowcnt
is {'a': 2, 'b': 2, 'c': 2}, but since there are already two distinct characters ('a' and 'b'), thewhile
loop will trigger to shrink the window. - In the
while
loop, we decrementcnt['a']
(as 'a' was at the start) and incrementj
; after decrementing twice, 'a' is removed fromcnt
. - Now
cnt
is {'b': 2, 'c': 2}, andj
moves past the initial 'a's, and is 2.
- With
-
Update the Maximum Length Result:
- Update
ans
toi - j + 1
, which is5 - 2 + 1 = 4
.
- Update
For the last character 'a' at i = 6
:
- Increment
cnt['a']
to 1,cnt
becomes {'b': 2, 'c': 2, 'a': 1}. - Then the
while
loop is entered and 'b's are removed similar to the 'a's before. j
is moved to 4, right after the last 'b'.- Update
ans
toi - j + 1
, which is now6 - 4 + 1 = 3
.
- Return the Result:
- The string is fully iterated, and the maximum
ans
was 4. Thus, the longest substring with at most two distinct characters is "aabb" or "bbcc", both with a length of 4.
- The string is fully iterated, and the maximum
Solution Implementation
1from collections import Counter
2
3class Solution:
4 def lengthOfLongestSubstringTwoDistinct(self, s: str) -> int:
5 # Initialize a counter to keep track of character frequencies
6 char_freq = Counter()
7
8 # Initialize the max_length to keep record of the longest substring with at most 2 distinct characters
9 max_length = 0
10
11 # Initialize the start index of the current substring with at most 2 distinct characters
12 start_index = 0
13
14 # Iterate over the string using an end_index pointer
15 for end_index, char in enumerate(s):
16 # Increment the frequency of the current character
17 char_freq[char] += 1
18
19 # If the number of distinct characters is more than 2,
20 # shrink the window from the left until we have at most 2 distinct characters
21 while len(char_freq) > 2:
22 char_freq[s[start_index]] -= 1
23 # If the count of the leftmost character is now zero, remove it from the counter
24 if char_freq[s[start_index]] == 0:
25 del char_freq[s[start_index]]
26 # Move left boundary of the window to the right
27 start_index += 1
28
29 # Update the max_length with the maximum between the current max_length and
30 # the length of the current substring with at most 2 distinct characters
31 max_length = max(max_length, end_index - start_index + 1)
32
33 # Return the maximum length of substring found
34 return max_length
35
1class Solution {
2 public int lengthOfLongestSubstringTwoDistinct(String s) {
3 // Create a HashMap to store the frequency of each character.
4 Map<Character, Integer> charFrequencyMap = new HashMap<>();
5 int length = s.length();
6 int maxLength = 0; // This will hold the length of the longest substring with at most two distinct characters.
7
8 // Two pointers defining the window of characters under consideration
9 for (int left = 0, right = 0; right < length; ++right) {
10 // Get the current character from the string.
11 char currentChar = s.charAt(right);
12 // Increase the frequency count of the character in our map.
13 charFrequencyMap.put(currentChar, charFrequencyMap.getOrDefault(currentChar, 0) + 1);
14
15 // If the map contains more than two distinct characters, shrink the window from the left
16 while (charFrequencyMap.size() > 2) {
17 char leftChar = s.charAt(left);
18 // Decrease the frequency count of this character.
19 charFrequencyMap.put(leftChar, charFrequencyMap.get(leftChar) - 1);
20 // Remove the character from the map if its count drops to zero, to maintain at most two distinct characters.
21 if (charFrequencyMap.get(leftChar) == 0) {
22 charFrequencyMap.remove(leftChar);
23 }
24 // Move the left pointer to the right
25 left++;
26 }
27
28 // Calculate the maximum length encountered so far.
29 maxLength = Math.max(maxLength, right - left + 1);
30 }
31
32 // Return the maximum length found.
33 return maxLength;
34 }
35}
36
1#include <unordered_map>
2#include <string>
3#include <algorithm>
4
5class Solution {
6public:
7 int lengthOfLongestSubstringTwoDistinct(std::string s) {
8 std::unordered_map<char, int> charFrequency; // Map to store the frequency of each character
9 int stringSize = s.size(); // Size of the input string
10 int maxLength = 0; // Variable to store the max length so far
11
12 // Two pointers technique, where 'left' is the start of the window and 'i' is the end
13 for (int left = 0, i = 0; i < stringSize; ++i) {
14 charFrequency[s[i]]++; // Increment the frequency of the current character
15
16 // If our map has more than two distinct characters, shrink the window from the left
17 while (charFrequency.size() > 2) {
18 charFrequency[s[left]]--; // Decrease the frequency of the leftmost character
19 if (charFrequency[s[left]] == 0) { // If frequency is zero, remove it from the map
20 charFrequency.erase(s[left]);
21 }
22 ++left; // Move the left boundary of the window to the right
23 }
24
25 // Calculate the current length of the substring and update the max length
26 maxLength = std::max(maxLength, i - left + 1);
27 }
28 return maxLength; // Return the length of the longest substring with at most two distinct characters
29 }
30};
31
1// Importing the Map object from the 'es6' library
2import { Map } from "es6";
3
4// Define the method lengthOfLongestSubstringTwoDistinct
5function lengthOfLongestSubstringTwoDistinct(s: string): number {
6 // Create a Map to store the frequency of each character
7 const charFrequency = new Map<string, number>();
8 // Determine the size of the input string
9 const stringSize = s.length;
10 // Variable to store the maximum length of substring found so far
11 let maxLength = 0;
12
13 // Use the two pointers technique, with 'left' as the start of the window and 'i' as the end
14 for (let left = 0, i = 0; i < stringSize; ++i) {
15 // Get the current character at index 'i'
16 const currentChar = s[i];
17 // Increment the frequency of the current character in the Map
18 const frequency = (charFrequency.get(currentChar) || 0) + 1;
19 charFrequency.set(currentChar, frequency);
20
21 // If there are more than two distinct characters, shrink the window from the left
22 while (charFrequency.size > 2) {
23 // Decrease the frequency of the character at the 'left' pointer
24 const leftChar = s[left];
25 const leftFrequency = (charFrequency.get(leftChar) || 0) - 1;
26
27 if (leftFrequency > 0) {
28 // If frequency is not zero, update it in the Map
29 charFrequency.set(leftChar, leftFrequency);
30 } else {
31 // If frequency is zero, remove it from the Map
32 charFrequency.delete(leftChar);
33 }
34 // Move the left boundary of the window to the right
35 ++left;
36 }
37
38 // Calculate the current length of the substring
39 const currentLength = i - left + 1;
40 // Update the maximum length if longer substring is found
41 maxLength = Math.max(maxLength, currentLength);
42 }
43
44 // Return the length of the longest substring with at most two distinct characters
45 return maxLength;
46}
47
48// Example usage:
49// const result = lengthOfLongestSubstringTwoDistinct("eceba");
50// console.log(result); // Output would be 3, for the substring "ece"
51
Time and Space Complexity
Time Complexity
The time complexity of the provided code is O(n)
, where n
is the length of the string s
. Here's a breakdown of the complexity:
- The
for
loop runs for every character in the string, which contributes toO(n)
. - Inside the loop, the while loop might seem to add complexity, but it will not exceed
O(n)
over the entire runtime of the algorithm, because each character is added once to the Counter and potentially removed once when the distinct character count exceeds 2. Thus, each character results in at most two operations. - The operations inside the while loop, like decrementing the Counter and conditional removal of an element from the Counter, are
O(1)
operations since Counter in Python is implemented as a dictionary.
Hence, combining these, we get a total time complexity of O(n)
.
Space Complexity
The space complexity of the code is O(k)
, where k
is the size of the distinct character set that the Counter can hold at any time. More precisely:
- Since the task is to find the longest substring with at most two distinct characters, the Counter
cnt
will hold at most 2 elements plus 1 element that will be removed once we exceed the 2 distinct characters. So in this case, k = 3. However, k is a constant here, so we often express this asO(1)
. - The variables
ans
,j
,i
, andc
are constant-size variables and do not scale with the input size, so they contributeO(1)
.
Therefore, the overall space complexity is O(1)
, since the Counter size is bounded by the small constant value which does not scale with n
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which of the following is a min heap?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos stack svg Sliding Window Maximum Monotonic Stack We have an array and a sliding window defined by a start index and an end index The sliding window moves from left of the array to right There are always k elements in
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!