737. Sentence Similarity II
Problem Description
The problem provides us with two sentences, each represented as an array of words, and an array of word pairs called similarPairs
. Each pair consists of words that are considered similar to each other. The task is to determine if the two sentences are similar by adhering to two conditions:
- Both sentences must be of the same length, meaning they should contain the same number of words.
- For each corresponding pair of words from
sentence1
andsentence2
, the words must be similar. It's given that the similarity relationship is transitive. That means ifa
is similar tob
andb
is similar toc
, thena
is similar toc
.
A word is always similar to itself. The output should be true
if the sentences are similar, otherwise false
.
Flowchart Walkthrough
For analyzing Leetcode 737. Sentence Similarity II, we'll use the algorithm flowchart to guide our problem-solving strategy. Here's a step-by-step walkthrough using the provided Flowchart:
Is it a graph?
- Yes: Similarity pairs between words can be represented as edges between nodes (words) in a graph.
Is it a tree?
- No: The graph is not necessarily hierarchical, and can contain cycles due to the similarity being transitive (i.e., if A is similar to B, and B is similar to C, then A is similar to C).
Is the problem related to directed acyclic graphs (DAGs)?
- No: This problem is more about equivalence classes formed by similarity relations, rather than topological ordering or other properties specific to DAGs.
Is the problem related to shortest paths?
- No: We’re trying to determine if two sentences are similar based on connectivity between words, not finding a 'shortest path' between them.
Does the problem involve connectivity?
- Yes: The core of the problem involves checking if two words are in the same connected component or not, which translates to seeing if there's a path from one word to another in the graph.
Is the graph weighted?
- No: The problem doesn't involve any form of weights or costs associated with the edges; it only matters whether there exists a direct or indirect link between words.
Conclusion: Considering the graph nature of the problem, its unweighted characteristic, and the requirement to explore possible connectivity between words, Depth-First Search (DFS) is very well-suited for exploring each connected component. This helps in determining if the entire sentences are similar based on the transitive similarity of their constituent words.
Intuition
To solve this problem, we need to figure out a way to keep track of which words are similar to each other. This requires not just checking direct similarities, but also indirect (transitive) connections between words.
The intuition behind the solution is to use Union-Find, a data structure that efficiently handles the equivalence relations, which is precisely what we need to determine the similarity between words. The core idea of Union-Find is to group similar words together so that we can quickly find out if two words are in the same group.
Here is how we approach the solution:
- First, we need to check if both sentences are of the same length. If not, we can immediately return
false
. - Then, we initialize Union-Find. For that purpose, we use a list called
p
, which will serve as the parent pointer for each element. At the start, each element is its own parent, indicating different sets. To efficiently track the indices for each word, a dictionary namedwords
maps each word to a unique index. - We iterate through each pair in
similarPairs
and perform the union operation, merging the sets that contain the two words in each pair. - Next, we compare words in the same positions from both sentences. If they are the same, we move on to the next pair. If not, we check if they are indirectly similar through the Union-Find data structure by finding their root parents and comparing them. If they have different root parents, we conclude the sentences are not similar and return
false
. - If none of the above returns
false
, it means all words are sufficiently similar, and we returntrue
.
By using Union-Find, we can handle the transitivity of similarity and quickly query any two words to see if they are part of the same group, thus determining if the sentences are similar or not.
Learn more about Depth-First Search, Breadth-First Search and Union Find patterns.
Solution Approach
The solution to this problem uses the Union-Find algorithm. Let's break down how the code implements this algorithm and how it applies to the problem.
-
Initially, a
p
list is created, using list comprehension, withn << 1
elements wheren
is the number of pairs insimilarPairs
. The expressionn << 1
is equivalent to multiplyingn
by 2 since it shifts the bits ofn
to the left by one, effectively doubling the number. This creates a list with indices from0
to2n - 1
, and each element is initialized to its own index, representing the initial parent of itself. -
The
find
function is a recursive function that takes an indexx
and finds the root parent ofx
. It does so by checking ifp[x]
is not equal tox
, which meansx
is not its own parent and it must have a parent somewhere up in its ancestor chain. The function continues to call itself with the parent ofx
until it reaches the root parent. To optimize this process for future lookups, path compression is used by settingp[x]
to the root parent found. This means the next time we callfind
onx
, it will directly reference its root parent. -
The
words
dictionary is used to assign each unique word a unique index. When we encounter a new word insimilarPairs
, we assign it the next available index in the sequence. -
To merge two word sets, the union operation is performed by setting the parent of one representative (found by
find
) to be the representative of the other. This effectively merges the sets, making them part of a single group. -
The main logic iterates through each pair of words in
sentence1
andsentence2
. If the current words are the same, they are trivially similar, and the algorithm moves on to the next pair. Otherwise, if one of the words is not in thewords
dictionary or if the root parents of the two words differ (found viafind
), the sentences are not similar, and the algorithm returnsfalse
.
If all word pairs are found to be similar, the algorithm returns true
after iterating through all word pairs. This means that all pairs have the same root parent (are in the same set), satisfying the similarity condition according to the transitivity property of the given relationship.
By using a combination of Union-Find with path compression and an efficient mapping of words to indices, the solution effectively checks the similarity of the sentences in relatively quick operations for each word pair.
In summary, the solution approach employs the Union-Find algorithm with path compression to determine if two sentences represented by arrays of words are similar based on the similarity of word pairs provided.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider an example to illustrate how the Union-Find algorithm is applied to determine if two sentences are similar.
Suppose we have the following inputs:
sentence1 = ["great", "acting", "skills"]
sentence2 = ["fine", "drama", "talent"]
similarPairs = [("great", "good"), ("good", "fine"), ("acting", "drama"), ("skills", "talent")]
-
Check Length: First, we verify that both sentences are the same length. Here, both have three words, so we can proceed.
-
Initialize Union-Find: We initialize
p
with indices up to2n - 1
, wheren
is the number of unique words across allsimilarPairs
. In this case,n = 6
(since "great", "good", "fine", "acting", "drama", and "skills" are all unique), sop = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
. -
Map Words to Indices: We create a
words
dictionary where each word is mapped to a unique index. It would look like this after processing allsimilarPairs
:{ "great": 0, "good": 1, "fine": 2, "acting": 3, "drama": 4, "skills": 5, "talent": 6 }
-
Union operation: We iterate through
similarPairs
and perform union operations using the indices fromwords
dictionary. After performing unions, the representative parents might look like this:// "good" and "great" are similar. p[find(1)] = find(0) // p[1] = 0 // "good" and "fine" are similar, and "good" is linked to "great", so indirectly "fine" is linked to "great". p[find(2)] = find(1) // p[2] = 0 // "acting" and "drama" are similar. p[find(4)] = find(3) // p[4] = 3 // "skills" and "talent" are similar. p[find(6)] = find(5) // p[6] = 5
-
Comparing Sentences: Finally, we compare
sentence1
andsentence2
word by word.- For "great" and "fine", look up each word in the
words
dictionary, get their indices, and use thefind
function to check if they have the same parent. Since "fine" has been indirectly linked to "great", they have the same parent. - For "acting" and "drama", do the same, and it will reveal that they have the same parent because they have been directly linked.
- For "skills" and "talent", they have also been directly linked, so they share the same parent.
Since for all pairs of words, either they are the same word or they belong to the same set (have the same representative parent in Union-Find terminologies), the algorithm will return
true
.The execution concludes that
sentence1
andsentence2
are indeed similar. - For "great" and "fine", look up each word in the
Solution Implementation
1from typing import List
2
3class Solution:
4 def areSentencesSimilarTwo(self, sentence1: List[str], sentence2: List[str], similarPairs: List[List[str]]) -> bool:
5 # Check if sentences are of different lengths, return False directly if they are
6 if len(sentence1) != len(sentence2):
7 return False
8
9 # Initialize the parent list for union-find
10 numPairs = len(similarPairs)
11 parent = list(range(numPairs * 2))
12
13 # Union-find find function to find the root of the set to which 'x' belongs
14 def find(x):
15 if parent[x] != x:
16 parent[x] = find(parent[x])
17 return parent[x]
18
19 # Dictionary to store the mapping from word to its index in the union-find structure
20 wordsToIndex = {}
21 index = 0
22 # Iterate over similar pairs and perform union operations
23 for a, b in similarPairs:
24 # If the word 'a' is not in the dictionary, add it and assign a corresponding index
25 if a not in wordsToIndex:
26 wordsToIndex[a] = index
27 index += 1
28 # If the word 'b' is not in the dictionary, add it and assign a corresponding index
29 if b not in wordsToIndex:
30 wordsToIndex[b] = index
31 index += 1
32 # Link the indices of the two words in the parent array to denote their similarity
33 pa, pb = find(wordsToIndex[a]), find(wordsToIndex[b])
34 parent[pa] = pb
35
36 # Iterate through the sentences and check if each pair of words is similar
37 for word1, word2 in zip(sentence1, sentence2):
38 # If the words are identical, continue
39 if word1 == word2:
40 continue
41 # If either word is not in the wordsToIndex, or their roots are not the same, return False
42 if (word1 not in wordsToIndex or
43 word2 not in wordsToIndex or
44 find(wordsToIndex[word1]) != find(wordsToIndex[word2])):
45 return False
46
47 # If all word pairs are either the same or similar, return True
48 return True
49
1class Solution {
2 private int[] parents; // Array to keep track of the root parent of each element in the Union-Find data structure
3
4 // Method to check if two sentences are similar based on similar word pairs
5 public boolean areSentencesSimilarTwo(
6 String[] sentence1, String[] sentence2, List<List<String>> similarPairs) {
7
8 // Sentences must be of the same length to be similar
9 if (sentence1.length != sentence2.length) {
10 return false;
11 }
12
13 // Initialize Union-Find data structure
14 int pairCount = similarPairs.size();
15 parents = new int[pairCount << 1]; // Twice the size, as each pair contains two elements
16 for (int i = 0; i < parents.length; ++i) {
17 parents[i] = i;
18 }
19
20 // Map to store the index of each unique word
21 Map<String, Integer> wordIndexMap = new HashMap<>();
22 int index = 0;
23 // Create the connections in Union-find for each similar pair
24 for (List<String> pair : similarPairs) {
25 String word1 = pair.get(0);
26 String word2 = pair.get(1);
27
28 // If the word is not already in our map, add it with a unique index
29 if (!wordIndexMap.containsKey(word1)) {
30 wordIndexMap.put(word1, index++);
31 }
32 if (!wordIndexMap.containsKey(word2)) {
33 wordIndexMap.put(word2, index++);
34 }
35
36 // Union the two words in the Union-Find data structure
37 parents[find(wordIndexMap.get(word1))] = find(wordIndexMap.get(word2));
38 }
39
40 // Check for each pair of words in the sentences if they are similar
41 for (int i = 0; i < sentence1.length; ++i) {
42 // If words are identical, they are trivially similar
43 if (Objects.equals(sentence1[i], sentence2[i])) {
44 continue;
45 }
46 // If either word is not known, or their roots in Union-Find are different, they are not similar
47 if (!wordIndexMap.containsKey(sentence1[i]) || !wordIndexMap.containsKey(sentence2[i]) ||
48 find(wordIndexMap.get(sentence1[i])) != find(wordIndexMap.get(sentence2[i]))) {
49 return false;
50 }
51 }
52 // If all word pairs are similar, sentences are similar
53 return true;
54 }
55
56 // Helper method to find the root parent in the Union-Find structure
57 private int find(int x) {
58 if (parents[x] != x) {
59 parents[x] = find(parents[x]); // Path compression for optimization
60 }
61 return parents[x];
62 }
63}
64
1#include <vector>
2#include <string>
3#include <unordered_map>
4using namespace std;
5
6class Solution {
7public:
8 vector<int> parent;
9
10 // Helper function to perform find operation in Union-Find
11 int find(int x) {
12 if (parent[x] != x)
13 parent[x] = find(parent[x]); // Path compression
14 return parent[x];
15 }
16
17 // Main function to check if two sentences are similar
18 bool areSentencesSimilarTwo(vector<string>& sentence1, vector<string>& sentence2, vector<vector<string>>& similarPairs) {
19 // Sentences are not similar if they are of different lengths
20 if (sentence1.size() != sentence2.size())
21 return false;
22
23 // Initialize Union-Find structure
24 int n = similarPairs.size();
25 parent.resize(n << 1);
26 for (int i = 0; i < parent.size(); ++i)
27 parent[i] = i;
28
29 // Mapping each word to an index
30 unordered_map<string, int> wordIndex;
31 int idx = 0;
32 for (auto& pair : similarPairs) {
33 string word1 = pair[0], word2 = pair[1];
34 // Assign new index if word hasn't been seen before
35 if (!wordIndex.count(word1))
36 wordIndex[word1] = idx++;
37 if (!wordIndex.count(word2))
38 wordIndex[word2] = idx++;
39 // Perform union operation for each similar pair
40 parent[find(wordIndex[word1])] = find(wordIndex[word2]);
41 }
42
43 // Check if each corresponding pair of words in the two sentences are similar
44 for (int i = 0; i < sentence1.size(); ++i) {
45 if (sentence1[i] == sentence2[i]) // Words are identical
46 continue;
47
48 // If either word isn't in the Union-Find structure or they are not connected,
49 // the sentences are not similar.
50 if (!wordIndex.count(sentence1[i]) || !wordIndex.count(sentence2[i]) || find(wordIndex[sentence1[i]]) != find(wordIndex[sentence2[i]]))
51 return false;
52 }
53
54 // All checks passed, sentences are similar
55 return true;
56 }
57};
58
1import { string } from "prop-types";
2
3// Define a vector-like structure to emulate the C++ vector behavior.
4let parent: number[] = [];
5
6// Helper function to perform find operation in Union-Find
7function find(x: number): number {
8 if (parent[x] != x)
9 parent[x] = find(parent[x]); // Path compression to optimize future searches
10 return parent[x];
11}
12
13// Function to check if two sentences are similar
14function areSentencesSimilarTwo(sentence1: string[], sentence2: string[], similarPairs: [string, string][]): boolean {
15 // Sentences are not similar if they are of different lengths
16 if (sentence1.length !== sentence2.length)
17 return false;
18
19 // Initialize Union-Find structure
20 const n = similarPairs.length;
21 parent = Array.from({ length: n << 1 }, (_, i) => i);
22
23 // Mapping each word to an index
24 const wordIndex: { [word: string]: number } = {};
25 let idx = 0;
26 for (const pair of similarPairs) {
27 const [word1, word2] = pair;
28 // Assign new index if word hasn't been seen before
29 if (!(word1 in wordIndex))
30 wordIndex[word1] = idx++;
31 if (!(word2 in wordIndex))
32 wordIndex[word2] = idx++;
33 // Perform union operation for each similar pair
34 parent[find(wordIndex[word1])] = find(wordIndex[word2]);
35 }
36
37 // Check if each corresponding pair of words in the two sentences are similar
38 for (let i = 0; i < sentence1.length; i++) {
39 if (sentence1[i] === sentence2[i]) // Words are identical
40 continue;
41
42 // If either word isn't in the Union-Find structure or they are not connected,
43 // the sentences are not similar.
44 if (!(sentence1[i] in wordIndex) || !(sentence2[i] in wordIndex) || find(wordIndex[sentence1[i]]) !== find(wordIndex[sentence2[i]]))
45 return false;
46 }
47
48 // All checks passed, sentences are similar
49 return true;
50}
51
Time and Space Complexity
Time Complexity
The time complexity of the provided code is analyzed as follows:
- Initializing the parent array
p
withn << 1
elements takesO(2n)
time wheren
is the number of similar pairs. Since each pair provides two elements, we have a maximum of2n
unique words. - The
find
function is a path compression optimized union-find operation. This results in an amortized time complexity ofO(α(n))
per find operation, whereα
is the inverse Ackermann function, which is a very slow-growing function and can be considered almost constant for all practical purposes. - Building the
words
dictionary for all unique words insimilarPairs
takesO(n)
operations, as we're doing constant time insertions. - The union operations in the loop for creating the
words
dictionary will runn
times which will have a complexity ofO(nα(n))
, since each union (by thefind
operation) has a complexity ofO(α(n))
. - The for loop to compare actual sentences runs
len(sentence1)
times. In the worst case, each look-up and eachfind
operation isO(α(n))
. Since there are twofind
operations for each word-pair comparison, the total time complexity for this part isO(2mα(n))
, wherem
is the number of words in sentence1/sentence2.
Therefore, the overall time complexity of the code is:
O(2n) + O(n) + O(nα(n)) + O(2mα(n))
Simplifying, since α(n)
is a very slow-growing function and can be treated as almost constant:
O(2n + n + n + 2m)
O(4n + 2m)
The time complexity is essentially linear with respect to the total number of elements in similarPairs
and the length of the sentence1
and sentence2
. When assuming α(n)
is constant, this can be simplified further to:
O(n + m)
Space Complexity
For space complexity, the following is taken into consideration:
- The parent array
p
has a size of2n
, which isO(2n)
. - The
words
dictionary storage depends on the number of unique words acrosssimilarPairs
, so in the worst case if all words are unique, it will beO(2n)
. - The
find
function uses recursive stack space, but due to path compression, the depth will be very shallow on average, contributing insignificantly to the overall space complexity.
Hence, the overall space complexity of the program is:
O(2n) + O(2n)
Which simplifies to:
O(4n)
Considering just the unique elements count, this is:
O(n)
The space complexity is linear with respect to the number of unique elements present in similarPairs
.
Learn more about how to find time and space complexity quickly using problem constraints.
In a binary min heap, the maximum element can be found in:
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS
https algomonster s3 us east 2 amazonaws com cover_photos bfs svg Breadth First Search on Trees Hopefully by this time you've drunk enough DFS Kool Aid to understand its immense power and seen enough visualization to create a call stack in your mind Now let me introduce the companion spell
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
Want a Structured Path to Master System Design Too? Don’t Miss This!