249. Group Shifted Strings
Problem Description
The given problem involves identifying sequences of strings that are formed by shifting characters. A string shift is where each letter of the string is moved to its next successive character in the alphabet. The task is to group all strings from a provided array that are part of the same shifting sequence. A shifting sequence is a series of strings where each string can be transformed into the next by performing this described shift operation. It's important to note that when a shift operation is applied to 'z', it wraps around to 'a'. The output can be in any order, and the aim is to group strings in the array which belong to the same shifting sequence.
For example, the string "abc" when shifted yields "bcd", and subsequently "cde", all the way up to "xyz". These shifted versions all belong to the same shifting sequence started by "abc". If "abc" and "bcd" are both in the input array, they should be grouped together in the output.
Intuition
To create a solution for grouping strings into shifting sequences, we need a way to determine if two strings belong to the same sequence. The intuition here is to "normalize" each string by shifting it back by the same amount so we can compare them. The amount to shift back by is determined by the difference between the ASCII code of the first character in the string and the ASCII code for 'a'. By doing this for every string, strings that belong to the same shifting sequence will end up looking the same after normalization.
This normalization process transforms each string into a "canonical form". For example, both "abc" and "bcd" would be normalized to "abc". These normalized strings act as keys in a dictionary (using a defaultdict for convenience) where each key maps to a list of original strings sharing the same normalized form.
Here’s the process in detail:
- Go through each string in the input array.
- Calculate the difference in ASCII values between the first character of the string and 'a'.
- Shift all characters in the string back by this calculated difference.
- If during shifting any character goes before 'a', wrap it around by adding 26 (total number of letters in the English alphabet).
- Join the shifted characters to get the normalized string.
- Add the original string to the list in the dictionary where the key is the normalized string.
- Once all strings are processed, extract all values from the dictionary. These are the groups of strings that belong to the same shifting sequence.
The provided code implements this approach, resulting in a dictionary where each entry is a list of strings that are in the same shifting sequence. These lists of strings are the solution we are looking for.
Solution Approach
The implementation of the solution involves a key algorithmic pattern known as hashing. Originally, all strings placed into a shifting sequence are hashed based on their normalized forms. The central data structure used for this is a defaultdict
from Python's collections
module, which simplifies the process of appending to lists within a dictionary without having to check if the key is already present.
Here's a step-by-step breakdown of the code implementation:
-
Initialize a
defaultdict
of lists:mp = defaultdict(list)
This creates a
mp
dictionary where each value is initialized as an empty list that keys, representing the normalized form of strings, will point to. -
Loop through each string in the provided list
strings
:for s in strings:
Each string
s
from the input list needs to be normalized. -
Calculate the difference between the ASCII value of the first character of
s
and'a'
to determine the amount to shift:diff = ord(s[0]) - ord('a')
This difference
diff
is used to shift the characters back. -
Create a new list
t
that will hold the shifted characters. -
Loop through the characters of the string
s
, shifting each bydiff
:for c in s: d = ord(c) - diff
Here, each character
c
from strings
is shifted back by the previously calculateddiff
. -
Check if the shifted ASCII value goes below
'a'
. If it does, wrap it around by adding 26 to get the correct character:if d < ord('a'): d += 26
This check makes sure the shifting maintains the correct alphabetical order.
-
Add the shifted character to the list
t
:t.append(chr(d))
By converting the ASCII value back to the character using
chr
, we get the normalized character. -
Join all the normalized characters in
t
to get the normalized form of the stringk
:k = ''.join(t)
This string
k
can now be used as a key in our dictionary. -
Append the original string
s
to the list inmp
corresponding to the keyk
:mp[k].append(s)
Strings with the same normalized form
k
will be grouped together. -
Finally, we obtain the groups of strings as lists of the dictionary's values:
return list(mp.values())
This line returns all lists from
mp
, which contains the groups of strings according to their shifting sequences.
By using a combination of hashing with normalization and a defaultdict, the solution efficiently groups the strings according to their belonging shifting sequence while maintaining an O(n * m) time complexity, where n
is the number of strings and m
is the length of the longest string.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's assume we have an array of strings: ["abc", "bcd", "acef", "xyz", "az", "ba", "a", "z"]
. According to our problem, strings that are a part of the same shifting sequence should be grouped together.
We start by initializing our defaultdict
of lists, which will hold our groups:
mp = defaultdict(list)
Next, we iterate over the given array of strings and normalize each string to find the shift sequence it belongs to:
- For the first string
"abc"
, we calculate the shift necessary to normalize it. The ASCII difference between'a'
of"abc"
and'a'
is0
. Thus, after normalization,"abc"
remains"abc"
. We append"abc"
to the list in our dictionary under the key"abc"
.
mp["abc"].append("abc")
- For the second string
"bcd"
, the shift from'b'
to'a'
is1
. Shifting each letter back by1
,"bcd"
becomes"abc"
. This string is grouped under the key"abc"
.
mp["abc"].append("bcd")
- Moving on to
"acef"
, the shift from'a'
to'a'
is again0
, so it's already normalized. We add"acef"
to a new group:
mp["acef"].append("acef")
- The fourth string
"xyz"
has a shift of23
from'x'
to'a'
. Shifting back each letter by23
, we wrap around the alphabet and"xyz"
becomes"abc"
. We add"xyz"
to the"abc"
group:
mp["abc"].append("xyz")
- For the string
"az"
, there is no uniform shift that can change'a'
to'z'
in the forward direction. But for our normalization, we only look at the first character. Since the first character is'a'
, the string"az"
stays as is. A new group is created:
mp["az"].append("az")
"ba"
is a bit more complex. The shift from'b'
to'a'
is1
. Shifting back the first 'b' gives us 'a', but for 'a', we need to wrap it around the alphabet which makes it 'z', resulting in the normalized form"az"
. Thus it's grouped with"az"
:
mp["az"].append("ba")
- The string
"a"
has no shift needed and is already normalized. It starts a new group on its own:
mp["a"].append("a")
- Finally,
"z"
needs to be shifted by25
to become"a"
, starting a new group:
mp["a"].append("z")
At the end of the iteration, we have the following groups:
{ "abc": ["abc", "bcd", "xyz"], "acef": ["acef"], "az": ["az", "ba"], "a": ["a", "z"] }
Returning all the values from our mp
dictionary gives us the required grouped strings according to their shifting sequences:
return [["abc", "bcd", "xyz"], ["acef"], ["az", "ba"], ["a", "z"]]
By following this approach, we make sure to normalize each string so it can be compared with others, and by using hashing with a defaultdict
, we efficiently group all strings into their respective shifting sequences.
Solution Implementation
1from collections import defaultdict
2
3class Solution:
4 def groupStrings(self, strings):
5 # A mapping from normalized string to the list of strings that match the pattern
6 normalized_to_group = defaultdict(list)
7
8 # Iterate over every string in the input list
9 for s in strings:
10 # List to store normalized characters
11 normalized_chars = []
12
13 # Calculate the difference between the first character and 'a'
14 shift = ord(s[0]) - ord('a')
15
16 # Normalize each character in the string
17 for char in s:
18 normalized_char_code = ord(char) - shift
19
20 # Ensure the normalized character code wraps around if it goes below 'a'
21 if normalized_char_code < ord('a'):
22 normalized_char_code += 26
23
24 # Add the normalized character to the list
25 normalized_chars.append(chr(normalized_char_code))
26
27 # Create the normalized string from the list of normalized characters
28 normalized_string = ''.join(normalized_chars)
29
30 # Add the original string to the group corresponding to the normalized string
31 normalized_to_group[normalized_string].append(s)
32
33 # Return all groups as a list of lists
34 return list(normalized_to_group.values())
35
1class Solution {
2
3 // This function groups the strings which can be obtained by shifting every character of any string in the group by the same number of positions
4 public List<List<String>> groupStrings(String[] strings) {
5 // HashMap to store groups of shifted strings
6 Map<String, List<String>> groupedStringsMap = new HashMap<>();
7
8 // Loop through each string in the array
9 for (String s : strings) {
10 // Find the shift required to bring the first character back to 'a'
11 int shift = s.charAt(0) - 'a';
12 // Convert the current string into character array for manipulation
13 char[] shiftedChars = s.toCharArray();
14
15 // Shift every character in the array such that the first character becomes 'a'
16 for (int i = 0; i < shiftedChars.length; ++i) {
17 // Calculate the shifted character considering the circular nature of the alphabet
18 char shifted = (char) (shiftedChars[i] - shift);
19 // If shifting left goes below 'a', wrap around from 'z'
20 if (shifted < 'a') {
21 shifted += 26;
22 }
23 // Update the character in the array
24 shiftedChars[i] = shifted;
25 }
26
27 // Convert the normalized character array to a string which can be used as a key in the map
28 String normalizedKey = new String(shiftedChars);
29 // Add the original string to the list corresponding to the normalized key
30 groupedStringsMap.computeIfAbsent(normalizedKey, k -> new ArrayList<>()).add(s);
31 }
32
33 // Convert the values of the hashMap into a list and return
34 return new ArrayList<>(groupedStringsMap.values());
35 }
36}
37
1#include <vector>
2#include <string>
3#include <unordered_map>
4#include <iostream>
5using namespace std;
6
7class Solution {
8public:
9 vector<vector<string>> groupStrings(vector<string>& strings) {
10 // Map to hold the grouped strings by normalized form
11 unordered_map<string, vector<string>> groupedStringsMap;
12
13 // Iterate over each string to normalize and group
14 for (auto& str : strings) {
15 // Calculate the offset to normalize the string
16 int offset = str[0] - 'a';
17 string normalizedStr = str;
18
19 // Normalize each character in the string
20 for (int i = 0; i < normalizedStr.size(); ++i) {
21 char normalizedChar = normalizedStr[i] - offset;
22 // Wrap around the alphabet if the normalized character is before 'a'
23 if (normalizedChar < 'a') {
24 normalizedChar += 26;
25 }
26 normalizedStr[i] = normalizedChar;
27 }
28
29 // For debugging: print the normalized string
30 cout << normalizedStr << endl;
31
32 // Group the original string with its normalized form
33 groupedStringsMap[normalizedStr].push_back(str);
34 }
35
36 // Placeholder for the result
37 vector<vector<string>> result;
38
39 // Build the result from the map
40 for (auto& entry : groupedStringsMap) {
41 result.push_back(entry.second);
42 }
43
44 return result;
45 }
46};
47
1// An array of strings representing the input to be grouped
2const strings: string[] = []; // Populate this array with the desired strings to be grouped
3
4// A function that calculates the difference needed to shift a character to 'a'
5function calculateOffset(char: string): number {
6 return char.charCodeAt(0) - 'a'.charCodeAt(0);
7}
8
9// A function that normalizes a string based on its first character
10function normalizeString(str: string): string {
11 let offset = calculateOffset(str[0]);
12 return str.split('').map(char => {
13 let normalizedCharCode = char.charCodeAt(0) - offset;
14 if (normalizedCharCode < 'a'.charCodeAt(0)) {
15 normalizedCharCode += 26;
16 }
17 return String.fromCharCode(normalizedCharCode);
18 }).join('');
19}
20
21// A function that groups the strings into vectors by their normalized forms
22function groupStrings(strings: string[]): string[][] {
23 const groupedStringsMap: { [key: string]: string[] } = {};
24
25 // Iterate over each string to normalize and group
26 strings.forEach(str => {
27 const normalizedStr = normalizeString(str);
28
29 // Debugging: console logging the normalized string
30 console.log(normalizedStr);
31
32 // If the normalized string is not yet a key in the map, initialize it with an empty array
33 groupedStringsMap[normalizedStr] = groupedStringsMap[normalizedStr] || [];
34 // Add the original string to the array corresponding to the normalized string
35 groupedStringsMap[normalizedStr].push(str);
36 });
37
38 // Extract the grouped strings into a result array
39 const result = Object.values(groupedStringsMap);
40
41 return result;
42}
43
44// Example usage:
45// let inputStrings = ['abc', 'bcd', 'acef', 'xyz', 'az', 'ba', 'a', 'z'];
46// let grouped = groupStrings(inputStrings);
47// console.log(grouped);
48
Time and Space Complexity
Time Complexity
The time complexity of the given code involves iterating over each string in the input list and then iterating over each character in the string.
- Let
n
be the total number of strings in the input liststrings
. - Let
k
be the average length of the strings.
For each string, we iterate through each character to compute a normalized representation by adjusting the character's ASCII values. This operation takes O(k)
time, as we have k
characters in an average string. Doing this for all strings results in O(nk)
time complexity for this step.
After we have the normalized strings, we insert them into a hashmap (mp
). The insertion into the hashmap will typically be O(1)
on average for each key-value pair, assuming that hash collisions are handled efficiently. We are inserting n
strings, so in total, this also takes O(n)
time.
Therefore, the total time complexity is O(nk) + O(n)
. Since O(nk)
will be the dominating term when k
is significant, we can consider the overall time complexity of the algorithm to be O(nk)
.
Space Complexity
For space complexity, we are using a hashmap to store the groups of strings. In the worst case, if all strings are distinct after normalization, the hashmap will have n
entries with a single string in each. Therefore, the space required for the hashmap is O(n)
.
Each string is being normalized, which may require up to k
space for each string. However, since this space is reused for each string, we don't count it n
times.
Hence, the total space complexity is O(n)
for the hashmap storage. This is the dominating term in the space complexity calculation irrespective of the average length k
of the strings.
Learn more about how to find time and space complexity quickly using problem constraints.
What data structure does Breadth-first search typically uses to store intermediate states?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!