734. Sentence Similarity
Problem Description
The problem asks to determine if two sentences are similar. Each sentence is represented as an array of words. For example, the sentence "I am happy with leetcode" can be represented as arr = ["I", "am", "happy", "with", "leetcode"]
. The words within each sentence are considered similar based on a provided list of word pairs similarPairs
, where each pair [xi, yi]
indicates that xi and yi are similar words. Two sentences are similar if they have the same number of words and each word in a position in sentence1
is similar to the corresponding word in the same position in sentence2
. A word is always considered similar to itself. It is important to note that the similarity between words is not transitive - just because word a is similar to word b and word b is similar to word c, it doesn't mean a is similar to c.
Intuition
To determine if the two sentences are similar, there are two main checks that need to be carried out:
-
Check Length: First, we need to check if both sentences have the same number of words. If they don't, we can immediately return
False
as they can't be similar with different lengths. -
Check Word Similarity: Then, we check each pair of corresponding words from
sentence1
andsentence2
. For each word pair(a, b)
we check:- If
a
is the same asb
, they are automatically similar. - If
a
andb
are found as a pair insimilarPairs
, they are similar. - As similarity is not directed (it is bidirectional), we should also check
(b, a)
insimilarPairs
.
- If
If all word pairs pass these similarity checks, the sentences are similar, and we return True
. Otherwise, at the first instance of dissimilarity, the overall similarity check fails, and we return False
.
The solution uses set comprehension to build a set s
from similarPairs
for quick lookup, and then utilizes the all
function combined with list comprehension to efficiently carry out the similarity checks for each word pair from the zipped sentences.
Solution Approach
The implemented solution takes advantage of the following algorithms, data structures, and programming constructs:
-
Set Data Structure: A set is used to store the
similarPairs
for efficient O(1) average time complexity on lookups. This is done by creating a sets
using set comprehension to hold all given similar word pairs in both directions (e.g., (a, b) and (b, a)). The nature of a set provides constant time membership testing, which is much faster compared to a list or array for this purpose. -
Zip Function: The
zip
function is a built-in Python function that aggregates elements from two or more iterables (arrays, lists, etc.) and returns an iterator of tuples. In this solution,zip
is used to create pairs of corresponding words fromsentence1
andsentence2
. -
List Comprehension with
all
Function: A list comprehension is used in conjunction with theall
function to check if all corresponding pairs of words betweensentence1
andsentence2
are similar. Theall
function returns True if all elements of the given iterable are true (or if the iterable is empty). The list comprehension iterates through each word pair and uses a logical or||
to evaluate if the words are identical, or if they exist as similar word pairs in the sets
.
Here is the step-by-step approach of the solution process:
-
Check Length Equality:
- Before iterating through the words, the solution checks if
sentence1
andsentence2
have the same length usinglen(sentence1) != len(sentence2)
. If they don't match, it returnsFalse
, thereby optimizing the solution by eliminating non-matching sentences early on.
- Before iterating through the words, the solution checks if
-
Iterative Comparison with Early Exit:
- The solution iterates over each pair of words
(a, b)
from the zipped sentences using list comprehension and checks:- If
a == b
, then the words are the same. - If
(a, b) in s
, the pair exists in the similarity set, so they are similar. - If
(b, a) in s
, since similarity is not transitive but is bidirectional, this pair is also checked in case the reverse was provided insimilarPairs
.
- If
- The
all
function wraps the list comprehension to ensure that every pair meets at least one of these conditions. If even one pair does not meet the similarity condition,all
returnsFalse
.
- The solution iterates over each pair of words
By using this approach, the solution effectively breaks down the problem into small logical checks, ensuring an efficient and clear way to compare the sentences. This results in a concise and effective code:
def areSentencesSimilar(self, sentence1: List[str], sentence2: List[str], similarPairs: List[List[str]]) -> bool:
if len(sentence1) != len(sentence2):
return False
s = {(a, b) for a, b in similarPairs}
return all(
a == b or (a, b) in s or (b, a) in s for a, b in zip(sentence1, sentence2)
)
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider a small example to illustrate the solution approach.
Suppose we have two sentences represented by the arrays below:
sentence1 = ["I", "love", "to", "code"]
sentence2 = ["I", "adore", "to", "program"]
And let's say we are also given the list of similar word pairs:
similarPairs = [["love", "adore"], ["code", "program"]]
When we apply our solution, this is how it will proceed:
-
Check Length Equality:
- Both
sentence1
andsentence2
have the same number of words (4 words each), so we move on to the next step. If they had a different number of words, we would returnFalse
at this point.
- Both
-
Iterative Comparison with Early Exit:
- We create an iterator of tuples by pairing up corresponding items from both sentences using
zip(sentence1, sentence2)
, which gives us[("I", "I"), ("love", "adore"), ("to", "to"), ("code", "program")]
. - We create a set from
similarPairs
to store all given similar word pairs, including their bidirectional counterparts - that would give us a sets
containing[("love", "adore"), ("adore", "love"), ("code", "program"), ("program", "code")]
. - Using list comprehension, we check each pair:
- First pair is
("I", "I")
: the words are identical, so this pair is similar. - Second pair is
("love", "adore")
: the pair exists in sets
, so this pair is also similar. - Third pair is
("to", "to")
: again, the words are identical, similar by default. - Fourth pair is
("code", "program")
: the pair exists in sets
, hence this pair is similar too.
- First pair is
- Since every pair of words from both sentences have passed the similarity check, the
all
function will ultimately returnTrue
.
- We create an iterator of tuples by pairing up corresponding items from both sentences using
Using this approach with the given sentence1
, sentence2
, and similarPairs
, our function areSentencesSimilar
would return True
because all corresponding word pairs are similar as per our rules.
Solution Implementation
1from typing import List
2
3class Solution:
4 def areSentencesSimilar(self, sentence1: List[str], sentence2: List[str], similar_pairs: List[List[str]]) -> bool:
5 # First, check if both sentences have the same number of words
6 # If not, they cannot be similar
7 if len(sentence1) != len(sentence2):
8 return False
9
10 # Create a set of pairs that are considered similar from the given list of similar pairs
11 # This allows for O(1) lookup time
12 similar_set = {(first, second) for first, second in similar_pairs}
13
14 # Check each pair of words from both sentences to see if they're the same
15 # If not, check if they are considered similar from the previously created set
16 # We check for both (a, b) and (b, a) to account for the order in which the similar pair might have been given
17 # If all pairs of words are either the same or similar (in any order), sentences are similar
18 return all(
19 word1 == word2 or (word1, word2) in similar_set or (word2, word1) in similar_set
20 for word1, word2 in zip(sentence1, sentence2)
21 )
22
23# Below this point would be code that uses the Solution class and its method,
24# usually including test cases to demonstrate the implementation.
25# Since the class itself should only contain the method definition, usage of the
26# class is not included here.
27
1class Solution {
2 // Method to determine whether two sentences are similar based on the provided similar word pairs.
3 public boolean areSentencesSimilar(
4 String[] sentence1, String[] sentence2, List<List<String>> similarPairs) {
5
6 // If the sentences have different lengths, they cannot be similar.
7 if (sentence1.length != sentence2.length) {
8 return false;
9 }
10
11 // Create a set to store the unique similar word pairs.
12 Set<String> similarPairSet = new HashSet<>();
13 for (List<String> pair : similarPairs) {
14 // Concatenate the two words with a delimiter and add to the set.
15 similarPairSet.add(pair.get(0) + "." + pair.get(1));
16 }
17
18 // Iterate through the words in both sentences.
19 for (int i = 0; i < sentence1.length; i++) {
20 String word1 = sentence1[i];
21 String word2 = sentence2[i];
22
23 // If the current words are not the same and the pair (both combinations)
24 // does not exist in the set, the sentences are not similar.
25 if (!word1.equals(word2) && !similarPairSet.contains(word1 + "." + word2)
26 && !similarPairSet.contains(word2 + "." + word1)) {
27 return false;
28 }
29 }
30
31 // If all word pairs are similar, return true.
32 return true;
33 }
34}
35
1#include <vector>
2#include <string>
3#include <unordered_set>
4
5class Solution {
6public:
7 // Function to check if two sentences are similar based on the provided similar word pairs.
8 bool areSentencesSimilar(std::vector<std::string>& sentence1, std::vector<std::string>& sentence2, std::vector<std::vector<std::string>>& similarPairs) {
9 int sentenceLength = sentence1.size(); // Storing the length of the first sentence.
10 // Check if the two sentences are of different lengths; if so, they can't be similar.
11 if (sentenceLength != sentence2.size()) return false;
12
13 // Create a set to hold the pairs as concatenated strings for quick lookup.
14 std::unordered_set<std::string> pairSet;
15 // Insert each pair of similar words into the set as 'word1.word2'.
16 for (const auto& pair : similarPairs) {
17 pairSet.insert(pair[0] + "." + pair[1]);
18 }
19
20 // Check every word in both sentences for similarity.
21 for (int i = 0; i < sentenceLength; ++i) {
22 const std::string& word1 = sentence1[i]; // Word from the first sentence.
23 const std::string& word2 = sentence2[i]; // Corresponding word from the second sentence.
24
25 // If words are not the same and neither composition of words exists in the similarity set, return false.
26 if (word1 != word2 && pairSet.find(word1 + "." + word2) == pairSet.end() && pairSet.find(word2 + "." + word1) == pairSet.end()) {
27 return false;
28 }
29 }
30
31 // If we make it through all word comparisons without returning false, the sentences are similar.
32 return true;
33 }
34};
35
1type StringPair = [string, string];
2
3// Global variables and function definitions in TypeScript
4
5// Function to check if two sentences are similar based on the provided similar word pairs.
6function areSentencesSimilar(sentence1: string[], sentence2: string[], similarPairs: StringPair[]): boolean {
7 // Storing the length of the first sentence.
8 const sentenceLength: number = sentence1.length;
9
10 // Check if the two sentences are of different lengths; if so, they can't be similar.
11 if (sentenceLength !== sentence2.length) return false;
12
13 // Create a set to hold the pairs as concatenated strings for quick lookup.
14 const pairSet: Set<string> = new Set();
15
16 // Insert each pair of similar words into the set as 'word1.word2'.
17 for (const pair of similarPairs) {
18 pairSet.add(pair[0] + "." + pair[1]);
19 }
20
21 // Check every word in both sentences for similarity.
22 for (let i = 0; i < sentenceLength; i++) {
23 const word1: string = sentence1[i]; // Word from the first sentence.
24 const word2: string = sentence2[i]; // Corresponding word from the second sentence.
25
26 // If words are not the same and neither composition of words exists in the similarity set, return false.
27 if (word1 !== word2 && !pairSet.has(word1 + "." + word2) && !pairSet.has(word2 + "." + word1)) {
28 return false;
29 }
30 }
31
32 // If we make it through all word comparisons without returning false, the sentences are similar.
33 return true;
34}
35
Time and Space Complexity
Time Complexity
The time complexity of the given code can be broken down as follows:
- Checking if
len(sentence1)
is equal tolen(sentence2)
takesO(1)
time, as it is simply comparing two integers. - Creating a set
s
from thesimilarPairs
takesO(p)
time, wherep
is the number of similar pairs. - The list comprehension iterates over each pair of words from
sentence1
andsentence2
, which occurs inO(n)
time, wheren
is the number of words in each sentence (since we've previously established the sentences are of equal length to reach this point). - Each comparison within the list comprehension is
O(1)
because checking equality of strings and checking for existence in a set are both constant-time operations.
Thus, the time complexity is O(n + p)
, where n
is the length of the sentences and p
is the number of similar word pairs.
Space Complexity
The space complexity can be analyzed as follows:
- The additional set
s
which stores the similar word pairs requiresO(p)
space, wherep
is the number of similar pairs. - The space for input variables
sentence1
,sentence2
, andsimilarPairs
is not counted as extra space as per convention since they are part of the input.
As a result, the auxiliary space complexity of the code is O(p)
.
Learn more about how to find time and space complexity quickly using problem constraints.
What data structure does Breadth-first search typically uses to store intermediate states?
Recommended Readings
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!