2977. Minimum Cost to Convert String II
Problem Description
In this problem, we are given two strings, source
and target
, both of 0-indexed length n
comprising lowercase English characters. We also have two arrays of strings, original
and changed
, and an array of integers cost
. Each element within the cost
array stands for the expense of altering a string in original
to its corresponding string in changed
.
The challenge lies in transforming source
into target
at the minimum possible overall cost, adhering to certain transformation rules. We may replace any substring from source
with a new substring (if there's a matching transformation in original
, changed
, and corresponding cost). The rules for performing transformations are:
- The substrings selected for two distinct operations should not overlap unless they are precisely the same.
- The substrings must either be disjoint or identical.
Our goal is to ascertain the lowest price to turn source
into target
. It may be that no sequence of operations can achieve the transformation, in which case we must return -1
.
Intuition
The core intuition for solving this problem is to treat the problem as a graph problem where strings represent nodes and conversion costs represent edges. We begin by creating a Trie structure to efficiently store all the source and destination strings (original
and changed
) as nodes with unique integer identifiers. This helps us to quickly look up any substring's corresponding transformation node.
We also initialize a graph, represented as a two-dimensional array g
, keeping track of the minimum cost to convert each possible string original[i]
to changed[i]
. This graph will be used in applying the Floyd-Warshall algorithm to find out the shortest path, or in our case, the minimum cost to transform any string to another given all the intermediate conversion possibilities provided by original
and changed
.
Next comes the Floyd-Warshall algorithm—the classic algorithm for finding shortest paths in a weighted graph with positive or negative edge weights. Here the algorithm helps precompute the minimum cost to convert any original[i]
to changed[i]
, considering intermediate conversions that might lower the total cost.
Once we have precomputed the conversion costs between any two substrings, we use Depth-First Search (DFS) along with memoization to recursively find the minimum cost needed to transform the source
into the target
, starting from the first character. Memoization helps to store the results of subproblems and avoid redundancy, speeding up the computation.
The DFS function works by comparing the current characters of source
and target
. If they are equal, we can continue to the next character without any cost. If they differ, DFS explores multiple scenarios where the current substring of source
is converted to all possible substrings of target
, each with their associated cost, and then adds the result of the recursive call for the rest of the string. The minimum cost out of these scenarios is selected.
Finally, it all boils down to calling our DFS starting from the beginning of the source
string and using the precomputed graph to efficiently manage transformations, with the minimum cost being retrieved from the memoized search.
Learn more about Graph, Trie, Dynamic Programming and Shortest Path patterns.
Solution Approach
The solution to this problem leverages a combination of data structures and algorithms to achieve its goal. Here's a step-by-step breakdown of the approach:
-
Trie Construction: We utilize a Trie to store all the
original
andchanged
strings. A Trie is an efficient data structure for storing a dynamic set of strings, which allows for quick lookup operations. In this case, the Trie helps us map everyoriginal
string to a unique integer identifier and, similarly, everychanged
string to another identifier. -
Graph Initialization: We then initialize a graph
g
as a two-dimensional array with dimensionsm x m
, wherem
is the count of all unique strings in the Trie. The graph contains information about the conversion costs, initialized to infinity (inf
), except for self-transitions, which have a cost of0
. -
Floyd-Warshall Algorithm: This classic algorithm is used to compute the shortest paths between all pairs of vertices in the graph. Applied here, it calculates the minimum cost to convert any string
i
to stringj
, factoring in the possibility of intermediate string transformations that could reduce the total cost. The update rule for Floyd-Warshall is often formulated as:`g[i][j] = min(g[i][j], g[i][k] + g[k][j])`
This implies that the direct cost to convert from
i
toj
is compared with the cost to convert fromi
tok
and then fromk
toj
, for each intermediate nodek
. We update the cost if the latter path is cheaper. -
Depth-First Search with Memoization: We define function
dfs(i)
to find the minimum cost of convertingsource[i..]
totarget[i..]
, starting from indexi
. If at any stepsource[i]
equalstarget[i]
, no cost incurs and the DFS continues to the next index. When they differ, we explore all possible conversions of substrings starting fromi
, considering the precomputed conversion costs. The functiondfs
utilizes memoization to store results of subproblems to avoid repeating calculations. -
Enumerating Substring Conversions: During the DFS, we consider every substring of
source
starting at indexi
:source[i..j]
for allj >= i
. For each substring, we find corresponding node identifiers in the Trie—single quotes denote code
. The conversion cost from the current source substring to the analogous target substring isg[p.v][q.v]
, wherep
andq
represent the nodes in the Trie for the source and target substrings. We then combine this cost with the atdfs(j + 1)
, the DFS for the remaining substring ofsource
starting atj + 1
. -
Final Answer: The minimal cost to transform the entire
source
totarget
is given by callingdfs(0)
, the DFS starting from the first index ofsource
. If the resulting cost is infinite (inf
), it means there is no way to fully convertsource
totarget
, and thus we return-1
. Otherwise, we return the computed minimum cost.
Put simply, the solution consolidates the precalculated costs of substring conversions and deploys DFS to recursively find the minimum cost from source
to target
, significantly aided by the memoization technique to optimize the recursion.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through a small example to illustrate the solution approach. Suppose we have the following inputs:
source
= "abcd"target
= "bdce"original
= ["a", "bc", "d"]changed
= ["b", "f", "e"]cost
= [1, 2, 1]
Constructing the Trie (Step 1)
We begin by inserting all original
and changed
strings into a Trie to efficiently organize the transformations. We'll have nodes with identifiers (for simplicity, a
, bc
, and d
will have identifiers 1, 2, 3, and b
, f
, and e
will have identifiers 4, 5, 6, respectively).
Initializing the Graph (Step 2)
Next, we create a graph g
with dimensions 6 x 6
as there are six unique strings (nodes) in the Trie.
We initialize the graph with:
g[i][i] = 0
for alli
representative of self-conversions which have no cost.g[i][j] = inf
for other elements, as we have not yet established conversion costs.
Updating the graph with the given costs gives us:
g[1][4] = 1
(cost of changing "a" to "b")g[2][5] = 2
(cost of changing "bc" to "f")g[3][6] = 1
(cost of changing "d" to "e")
Applying Floyd-Warshall (Step 3)
We execute the Floyd-Warshall algorithm to precompute the minimum costs for converting any original
string to changed
accounting for intermediate transformations.
Since our graph already reflects the direct costs and there are no cheaper intermediate nodes to connect them (the graph has no additional edges), the graph remains unchanged after Floyd-Warshall in this small example.
Depth-First Search (DFS) with Memoization (Step 4 & 5)
We call the dfs
function starting at index 0
. Let's examine possible conversions at each step:
-
dfs(0)
: "abcd" -> "bdce"- Converting "a" (source[0]) to "b" has a cost of 1 (g[1][4]).
- After this operation, "bcd" must be converted to "dce".
- Call
dfs(1)
, which will process "bcd".
-
dfs(1)
: "bcd" -> "dce"- Converting "bc" to "f" does not help us in this case as "f" is not a substring of "dce".
- Converting "b" to "d" is not possible with our given transformations.
- Therefore, we look at converting "c" ("b" is skipped as there is no change that matches).
- There is no direct conversion available from "c" to "d", we skip to the next character.
-
dfs(2)
: "d" -> "ce"- Converting "d" to "e" has a cost of 1 (g[3][6]).
- After this operation, nothing is left from "source", but we have an extra "c" in the target that cannot be achieved, so this path is not possible.
Since none of the transformations satisfy the complete conversion from "abcd" to "bdce", we reach a dead-end.
Results (Step 6)
After exploring all possible conversions via DFS, we see that:
- The cost of transforming "a" to "b" is
1
. - There is no cost-effective way to transform "bcd" to "dce", leading to an infinite cost.
Hence, the result is -1
, indicating that it is not possible to transform source
to target
using the given transformations and costs.
Solution Implementation
1from typing import List
2from functools import cache
3from math import inf
4
5
6class Node:
7 __slots__ = ["children", "value"]
8
9 def __init__(self):
10 # Each Node has up to 26 children for each letter of the alphabet and a value.
11 self.children: List[Node | None] = [None] * 26
12 self.value = -1
13
14
15class Solution:
16 def minimumCost(self, source: str, target: str, original: List[str], changed: List[str], cost: List[int]) -> int:
17 m = len(cost)
18 # Initialize a graph with infinite edge costs.
19 graph = [[inf] * (m << 1) for _ in range(m << 1)]
20
21 for i in range(m << 1):
22 graph[i][i] = 0
23
24 root = Node()
25 index = 0
26
27 def insert(word: str) -> int:
28 # Insert a word into the trie and return its index.
29 node = root
30 for char in word:
31 i = ord(char) - ord('a')
32 if node.children[i] is None:
33 node.children[i] = Node()
34 node = node.children[i]
35 if node.value < 0:
36 nonlocal index
37 node.value = index
38 index += 1
39 return node.value
40
41 @cache
42 def dfs(i: int) -> int:
43 # Depth-first search to find the minimum cost to transform source[i:] to target[i:].
44 if i >= len(source):
45 return 0
46 cost_if_same = dfs(i + 1) if source[i] == target[i] else inf
47
48 p = q = root
49 minimum_cost = cost_if_same
50
51 for j in range(i, len(source)):
52 if p is None or q is None:
53 break
54 p = p.children[ord(source[j]) - ord('a')]
55 q = q.children[ord(target[j]) - ord('a')]
56
57 if p is None or q is None or p.value < 0 or q.value < 0:
58 continue
59
60 minimum_cost = min(minimum_cost, dfs(j + 1) + graph[p.value][q.value])
61 return minimum_cost
62
63 # Build the initial graph using original and changed words, and their associated cost.
64 for x, y, z in zip(original, changed, cost):
65 x = insert(x)
66 y = insert(y)
67 graph[x][y] = min(graph[x][y], z)
68
69 # Use Floyd–Warshall algorithm to find the shortest paths in graph.
70 for k in range(index):
71 for i in range(index):
72 if graph[i][k] >= inf: # Skip if the edge does not exist.
73 continue
74 for j in range(index):
75 if graph[i][k] + graph[k][j] < graph[i][j]:
76 graph[i][j] = graph[i][k] + graph[k][j]
77
78 # Calculate the minimum cost to transform the entire source to the target.
79 answer = dfs(0)
80
81 # Return -1 if it's not possible or the answer if it's valid.
82 return -1 if answer >= inf else answer
83
1import java.util.Arrays;
2
3class Node {
4 Node[] children = new Node[26];
5 int valueIndex = -1; // Initialize with -1 to indicate not assigned.
6}
7
8class Solution {
9 private final long INF = 1L << 60; // A large value to represent infinity in the context.
10 private Node root = new Node(); // Trie root node.
11 private int index; // Tracks indices for the nodes in the trie.
12
13 private long[][] graph; // Represents the cost graph between trie node indices.
14 private char[] sourceArray; // Source string to character array.
15 private char[] targetArray; // Target string to character array.
16 private Long[] dpMemo; // Dynamic programming memoization table.
17
18 // Computes the minimum cost to convert the `source` string to the `target` string.
19 public long minimumCost(
20 String source, String target, String[] originals, String[] changeds, int[] costs) {
21 int transformationCount = costs.length;
22 graph = new long[transformationCount << 1][transformationCount << 1];
23 sourceArray = source.toCharArray();
24 targetArray = target.toCharArray();
25
26 // Initialize graph with INF and 0 on the diagonal (no cost to convert to itself).
27 for (int i = 0; i < graph.length; ++i) {
28 Arrays.fill(graph[i], INF);
29 graph[i][i] = 0;
30 }
31
32 // Build the trie and graph with minimum costs between conversions.
33 for (int i = 0; i < transformationCount; ++i) {
34 int startIndex = insertIntoTrie(originals[i]);
35 int endIndex = insertIntoTrie(changeds[i]);
36 graph[startIndex][endIndex] = Math.min(graph[startIndex][endIndex], costs[i]);
37 }
38
39 // Floyd-Warshall algorithm to find all-pairs shortest paths in the graph.
40 for (int k = 0; k < index; ++k) {
41 for (int i = 0; i < index; ++i) {
42 if (graph[i][k] >= INF) {
43 continue; // Skip if there is no path from i to k.
44 }
45 for (int j = 0; j < index; ++j) {
46 graph[i][j] = Math.min(graph[i][j], graph[i][k] + graph[k][j]);
47 }
48 }
49 }
50
51 // Initialize dynamic programming memoization table.
52 dpMemo = new Long[sourceArray.length];
53 // Start the recursive depth-first search from index 0.
54 long answer = dfs(0);
55 // If the result is INF, no valid conversion exists; return -1.
56 return answer >= INF ? -1 : answer;
57 }
58
59 // Inserts a string into the trie and returns its index (valueIndex).
60 private int insertIntoTrie(String word) {
61 Node node = root;
62 for (char c : word.toCharArray()) {
63 int index = c - 'a';
64 if (node.children[index] == null) {
65 node.children[index] = new Node();
66 }
67 node = node.children[index];
68 }
69 if (node.valueIndex < 0) {
70 node.valueIndex = index++; // Assign index if not already assigned.
71 }
72 return node.valueIndex;
73 }
74
75 // Recursive depth-first search function to compute minimum cost.
76 private long dfs(int currentIndex) {
77 // Base case: If we've reached the end of the source array, no cost needed.
78 if (currentIndex >= sourceArray.length) {
79 return 0;
80 }
81 // Retrieve precomputed result from memo table if it exists.
82 if (dpMemo[currentIndex] != null) {
83 return dpMemo[currentIndex];
84 }
85 // Compare characters at the current index of source and target strings.
86 long result = sourceArray[currentIndex] == targetArray[currentIndex] ? dfs(currentIndex + 1) : INF;
87
88 // Search for the cheapest conversion from currentIndex to the end of strings.
89 Node sourceNode = root, targetNode = root;
90 for (int j = currentIndex; j < sourceArray.length; ++j) {
91 sourceNode = sourceNode.children[sourceArray[j] - 'a'];
92 targetNode = targetNode.children[targetArray[j] - 'a'];
93 if (sourceNode == null || targetNode == null) {
94 break; // No possible conversion if either nodes are null.
95 }
96 if (sourceNode.valueIndex < 0 || targetNode.valueIndex < 0) {
97 continue; // If no index assigned to nodes, skip.
98 }
99 long transformationCost = graph[sourceNode.valueIndex][targetNode.valueIndex];
100 if (transformationCost < INF) {
101 result = Math.min(result, transformationCost + dfs(j + 1));
102 }
103 }
104 // Store the result in memoization table and return.
105 return dpMemo[currentIndex] = result;
106 }
107}
108
1class Node {
2public:
3 Node* children[26]; // Pointers to child Nodes, one for each letter in the alphabet
4 int value = -1; // Unique identifier for the node, -1 means uninitialized
5 Node() {
6 fill(begin(children), end(children), nullptr);
7 }
8};
9
10class Solution {
11private:
12 const long long kInf = 1LL << 60; // Represents an unreachable state or infinity
13 Node* root = new Node(); // Root of the trie
14 int index; // Index to assign unique IDs to trie nodes
15
16 vector<vector<long long>> graph; // Representation of the transformation costs as a graph
17 string sourceString; // Source string to transform from
18 string targetString; // Target string to transform to
19 vector<long long> memoizationCache; // Memoization cache to store results of subproblems
20
21public:
22 long long minimumCost(string source, string target, vector<string>& original, vector<string>& changed, vector<int>& cost) {
23 int modificationCount = cost.size();
24 graph = vector<vector<long long>>(modificationCount << 1, vector<long long>(modificationCount << 1, kInf));
25 sourceString = source;
26 targetString = target;
27
28 // Set self-transformation cost to 0
29 for (int i = 0; i < graph.size(); ++i) {
30 graph[i][i] = 0;
31 }
32
33 // Build the graph with transformation costs
34 for (int i = 0; i < modificationCount; ++i) {
35 int fromIndex = insert(original[i]);
36 int toIndex = insert(changed[i]);
37 graph[fromIndex][toIndex] = min(graph[fromIndex][toIndex], static_cast<long long>(cost[i]));
38 }
39
40 // Floyd-Warshall algorithm to find all pairs shortest paths
41 for (int k = 0; k < index; ++k) {
42 for (int i = 0; i < index; ++i) {
43 if (graph[i][k] >= kInf) continue; // Skip over infinite costs
44 for (int j = 0; j < index; ++j) {
45 graph[i][j] = min(graph[i][j], graph[i][k] + graph[k][j]);
46 }
47 }
48 }
49
50 // Initialize memoization cache for DFS
51 memoizationCache = vector<long long>(sourceString.length(), -1);
52 long long answer = depthFirstSearch(0);
53 // Return -1 if it is not possible to transform source to target string within given costs
54 return answer >= kInf ? -1 : answer;
55 }
56
57private:
58 // Inserts a word into the trie and returns its index
59 int insert(const string& word) {
60 Node* node = root;
61 for (char c : word) {
62 int i = c - 'a';
63 if (node->children[i] == nullptr) {
64 node->children[i] = new Node();
65 }
66 node = node->children[i];
67 }
68 if (node->value < 0) {
69 node->value = index++;
70 }
71 return node->value;
72 }
73
74 // DFS to compute the minimum cost of transforming the source substring starting at index `i`
75 long long depthFirstSearch(int i) {
76 if (i >= sourceString.length()) {
77 return 0;
78 }
79 if (memoizationCache[i] != -1) {
80 return memoizationCache[i];
81 }
82 long long result = (sourceString[i] == targetString[i]) ? depthFirstSearch(i + 1) : kInf;
83 Node* pNode = root;
84 Node* qNode = root;
85 for (int j = i; j < sourceString.length(); ++j) {
86 pNode = pNode->children[sourceString[j] - 'a'];
87 qNode = qNode->children[targetString[j] - 'a'];
88 if (pNode == nullptr || qNode == nullptr) {
89 break;
90 }
91 if (pNode->value < 0 || qNode->value < 0) {
92 continue;
93 }
94 long long tempCost = graph[pNode->value][qNode->value];
95 if (tempCost < kInf) {
96 result = min(result, tempCost + depthFirstSearch(j + 1));
97 }
98 }
99 return memoizationCache[i] = result;
100 }
101};
102
1// Declaration for the Node type
2type Node = {
3 children: (Node | null)[];
4 value: number;
5};
6
7// Function to create a new Node with initialized values
8const createNode = (): Node => ({
9 children: Array(26).fill(null),
10 value: -1
11});
12
13// The global "root" node of the trie
14const root: Node = createNode();
15
16// Mapping from index to nodes
17let index: number = 0;
18
19// Memoization array for the DFS
20const memo: number[] = Array(100).fill(-1);
21
22// Function to insert a word into the trie and return the corresponding node value
23const insertWord = (word: string): number => {
24 let node: Node = root;
25 for (const char of word) {
26 const charIndex: number = char.charCodeAt(0) - 'a'.charCodeAt(0);
27 if (node.children[charIndex] === null) {
28 node.children[charIndex] = createNode();
29 }
30 node = node.children[charIndex]!;
31 }
32 if (node.value < 0) {
33 node.value = index++;
34 }
35 return node.value;
36};
37
38// The main minimumCost function definition
39function minimumCost(
40 source: string,
41 target: string,
42 original: string[],
43 changed: string[],
44 cost: number[],
45): number {
46 const originalLength = cost.length;
47 const sourceLength = source.length;
48 const graph: number[][] = Array.from({ length: originalLength << 1 }, () => Array(originalLength << 1).fill(Infinity));
49
50 // Fill the graph based on original and changed strings and their associated costs
51 for (let i = 0; i < originalLength; ++i) {
52 const fromNode: number = insertWord(original[i]);
53 const toNode: number = insertWord(changed[i]);
54 graph[fromNode][toNode] = Math.min(graph[fromNode][toNode], cost[i]);
55 }
56
57 // Apply the Floyd-Warshall algorithm to find all pairs shortest paths
58 for (let k = 0; k < index; ++k) {
59 for (let i = 0; i < index; ++i) {
60 for (let j = 0; j < index; ++j) {
61 graph[i][j] = Math.min(graph[i][j], graph[i][k] + graph[k][j]);
62 }
63 }
64 }
65
66 // The DFS function to be used in the minimumCost function
67 const depthFirstSearch = (i: number): number => {
68 if (i >= sourceLength) {
69 return 0;
70 }
71 if (memo[i] !== -1) {
72 return memo[i];
73 }
74 let result: number = source[i] === target[i] ? depthFirstSearch(i + 1) : Infinity;
75 let p: Node = root;
76 let q: Node = root;
77 for (let j = i; j < source.length; ++j) {
78 p = p.children[source[j].charCodeAt(0) - 'a'.charCodeAt(0)]!;
79 q = q.children[target[j].charCodeAt(0) - 'a'.charCodeAt(0)]!;
80 if (!p || !q) {
81 break;
82 }
83 if (p.value < 0 || q.value < 0) {
84 continue;
85 }
86 const t: number = graph[p.value][q.value];
87 result = Math.min(result, t + depthFirstSearch(j + 1));
88 }
89 return (memo[i] = result);
90 };
91
92 // Start the DFS from the first character
93 const answer: number = depthFirstSearch(0);
94
95 // Return the minimum cost or -1 if no valid transformation is possible
96 return answer >= Infinity ? -1 : answer;
97}
98
Time and Space Complexity
Time Complexity
The time complexity of this code involves multiple components:
-
Initializing a 2D array
g
with dimensionsm << 1
bym << 1
(which means2*m
by2*m
, since<< 1
is a left bit shift equivalent to multiplying by 2), wherem
is the length of the arraysoriginal
,changed
, andcost
. This operation isO(m^2)
since it requires initializing anm * m
matrix. -
Inserting strings from
original
andchanged
into trie nodes. Insertion of a word into a trie has a complexity ofO(k)
, wherek
is the length of the word. Since every original and changed string is inserted once, and there arem
such strings, the complexity for this part becomesO(m * k)
, wherek
is the average length of the words inoriginal
andchanged
. However, in this analysis,k
seems to be considered constant, thus this step is included inO(m)
. -
Floyd-Warshall algorithm is used to compute the shortest path, which is
O(m^3)
as it involves three nested loops each running for all the nodes in the graph. -
The
dfs
function is a recursive function with memoization that explores multiple paths in thesource
andtarget
strings. This complexity is harder to analyze directly but a rough upper bound can beO(n^2)
wheren
is the length of thesource
andtarget
, noting that due to memoization and early termination it may often be much less. -
The double for-loop within the
dfs
function seems to contribute an additionalO(m * n)
since it iterates through two pointers insource
andtarget
betweeni
andlen(source)
while takingm
into account through the checks onp.v
andq.v
.
Overall, combining these operations, the time complexity is O(m^3 + n^2 + m * n)
.
Space Complexity
The space complexity is determined by:
-
The space used by the memoization cache in the
dfs
function, which can go up toO(n)
. -
The 2D array
g
of size(m << 1) * (m << 1)
contributesO(m^2)
. -
The space taken by the trie to store up to
2*m
nodes (m
fromoriginal
andm
fromchanged
), each containing an array of 26 children pointers (due to the English lowercase alphabet), isO(26*m)
which simplifies toO(m)
. -
Additional space of
O(n)
for the call stack due to recursion in thedfs
function.
Combining these contributions, the space complexity is O(m^2 + m * n + n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
A person thinks of a number between 1 and 1000. You may ask any number questions to them, provided that the question can be answered with either "yes" or "no".
What is the minimum number of questions you needed to ask so that you are guaranteed to know the number that the person is thinking?
Recommended Readings
https algomonster s3 us east 2 amazonaws com cover_photos graph svg Graph Fundamentals Tree with 0 cycle At this point you should be pretty familiar with trees A tree is a special kind of graph a connected acyclic cycle less graph A graph may contain cycle s and nodes could
Trie Introduction Imagine you have a magic dictionary where as soon as you start spelling a word it immediately tells you how many words that start with those letters How cool would that be This is exactly the kind of magic a data structure called trie performs What's a Trie
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
Want a Structured Path to Master System Design Too? Don’t Miss This!