2003. Smallest Missing Genetic Value in Each Subtree
Problem Description
In this problem, we are given a family tree with n
nodes, each node has a unique genetic value. Our task is to find the smallest genetic value that is not present in the subtree of each node. The tree is uniquely represented by an array parents
where parents[i]
gives us the parent of the node i
. The root of the tree is the node with index 0
(thus parents[0] == -1
).
Every node has a distinct genetic value, given by the array nums
. Genetic values are within the range of [1, 10^5]
. We are supposed to return an array ans
where ans[i]
is the smallest missing genetic value in the subtree rooted at node i
. A subtree rooted at a node x
consists of node x
and all its descendants.
Flowchart Walkthrough
First, let's pin down the algorithm using the Flowchart. Here's a step-by-step walkthrough:
Is it a graph?
- Yes: The problem models the structure given (a tree where nodes represent genetic values) as a graph.
Is it a tree?
- Yes: The structure provided is a tree since each node (except one) has exactly one parent.
Perform DFS for tree-based exploration:
- Here, the key requirement is to explore subtree structures and maintain state relationally through parent-child dependencies, perfectly suited for DFS as it excels in hierarchical data structures like trees.
Conclusion: The flowchart suggests using DFS for exploring and processing the tree structure effectively in order to solve the problem related to the smallest missing genetic value in each subtree.
Intuition
The challenge here is that we could traverse the subtree of each node and collect all genetic values present, but that would be highly inefficient for a large number of nodes. So the key insight is to leverage the unique genetic value property - especially the node with the genetic value of 1.
We begin by noting that if a subtree doesn't contain the genetic value of 1, then 1 is the smallest missing value for all nodes in that subtree. This allows us to initialize the answer array ans
with ones.
We then locate the node idx
that has the genetic value of 1. From this node, we move upwards to the root, and for each of these nodes, we find the smallest missing genetic value in their subtrees. We use depth-first search (DFS) starting from idx
to mark all the genetic values that appear in its subtree.
If idx
is not found, which means the genetic value of 1 is missing in the entire tree, then all nodes have an answer of 1, and there's no need to proceed further.
If we do find idx
, we have to dynamically update the smallest missing genetic values while traveling upwards towards the root. For that, we use a flag array vis
to keep track of the visited nodes (to avoid revisiting during DFS), and an array has
to mark which genetic values are present.
We continue the DFS process to mark present genetic values, and after each subtree traversal, we update the corresponding node's smallest missing genetic value in the answer array ans
. Once a node is processed, we move to its parent and repeat until we reach the root.
By maintaining an iterative index i
(starting from 2, the first non-present value after 1) and incrementing it whenever we find the corresponding genetic value in has
, we efficiently find the next smallest missing genetic value.
Finally, we return the array ans
, containing the smallest missing genetic values for all nodes' subtrees.
Learn more about Tree, Depth-First Search, Union Find and Dynamic Programming patterns.
Solution Approach
The solution utilizes a Depth-First Search (DFS) algorithm to traverse the tree along with additional data structures to optimize the search for the smallest missing genetic values.
The code initializes a graph g
to represent the family tree. Each index of the graph corresponds to a node in the tree, and it contains a list of child nodes. This graph is used during DFS to iterate over the children of a node.
A crucial part of the algorithm is identifying the node idx
with the genetic value of 1 because this serves as a starting point for our optimized search. From there, the solution implements the following steps iteratively:
-
A DFS traversal begins at the current
idx
, marking genetic values as present in the subtree rooted at this node. This is done using a helper functiondfs(i)
, wherei
is the current node. Any genetic valuenums[i]
that is found during traversal and is less than the length of thehas
array is marked asTrue
inhas[nums[i]]
to indicate its presence. -
The
vis
array is used to mark nodes that have been visited to prevent revisiting nodes, which could lead to an infinite loop or unnecessary computations. -
After the DFS traversal, the code searches for the smallest missing value starting from
i = 2
, since the genetic value of 1 is already accounted for by our starting indexidx
. It looks for the first value not marked as present in thehas
array. -
The smallest missing value found is assigned to
ans[idx]
. This corresponds to the smallest missing genetic value in the subtree rooted atidx
. -
The current
idx
is then updated toparents[idx]
, moving up one level in the tree to the parent of the current node. -
This process repeats until the root node is reached (
idx == -1
), and the smallest missing genetic values for the nodes along the path from the original node with the genetic value of 1 to the root node are updated in theans
array. -
The nodes that are not on the path from the node with genetic value of 1 to the root have an answer of 1, as their subtrees do not contain the genetic value of 1.
This solution approach is efficient because it avoids checking the entire subtree for each node. Instead, it focuses on the path from the node with genetic value 1 to the root and leverages the fact that the subtrees of nodes outside this path will not influence the smallest missing genetic values along this path. Additionally, the has
array dynamically tracks which values are found, allowing for an efficient search for missing values.
Finally, the ans
array is returned, providing the smallest missing genetic values for each node's subtree.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example.
Suppose we have the following family tree with n = 5
nodes and the parents
array: [-1, 0, 1, 1, 2]
. This indicates that:
- Node
0
is the root asparents[0] == -1
. - Node
1
is a child of node0
. - Node
2
is a child of node1
. - Node
3
is also a child of node1
. - Node
4
is a child of node2
.
The nums
array gives us the genetic values: [3, 1, 5, 4, 2]
:
- Node
0
has a genetic value of3
. - Node
1
has a genetic value of1
. This is our starting point since it's the unique node with value1
. - Node
2
has a genetic value of5
. - Node
3
has a genetic value of4
. - Node
4
has a genetic value of2
.
First, we initialize our answer array ans
with ones, assuming that the smallest missing genetic value for each node's subtree is 1
: ans = [1, 1, 1, 1, 1]
.
Since node 1
has the genetic value 1
, we begin a DFS traversal from this node.
While we traverse the subtree rooted at node 1
, we mark in the has
array which genetic values we encounter. Starting from node 1
, we visit nodes 2
and 4
. We update the has
array with the genetic values found: has = [False, True, True, False, True]
(the indexing is 1-based for genetic values).
After visiting the subtree of node 1
, we look for the smallest missing genetic value in has
. Since has[3]
is False
, the smallest missing value for this subtree is 3
. We then update ans[1]
to 3
.
Now we move to the parent of node 1
, which is the root node 0
. We continue our DFS, but since we already visited node 1
's subtree, we now visit node 3
. After visiting, we update our has
array to include the genetic value of 4
: has = [False, True, True, True, True]
.
For node 0
, we again search for the smallest missing genetic value. has[6]
is our next check, but since our genetic values only range from 1
to n
, the smallest missing value is 6
. So we update ans[0]
to 6
.
No other nodes are parents, so we've completed our traversal. The final answer array is ans = [6, 3, 1, 1, 1]
, indicating the smallest missing genetic values for the subtrees of each node.
This approach is efficient for larger trees because we only traverse each node once and update our smallest missing values dynamically as we move through the tree.
Solution Implementation
1from typing import List
2
3class Solution:
4 def smallestMissingValueSubtree(self, parents: List[int], nums: List[int]) -> List[int]:
5 # Depth-First Search function to visit nodes
6 def dfs(current_node: int):
7 # If the current node is already visited, don't revisit
8 if visited[current_node]:
9 return
10 # Mark the current node as visited
11 visited[current_node] = True
12 # If the value at current node is within range, update has_value array
13 if nums[current_node] < len(has_value):
14 has_value[nums[current_node]] = True
15 # Recursively visit all the children of the current node
16 for child in graph[current_node]:
17 dfs(child)
18
19 # Number of nodes
20 num_nodes = len(nums)
21 # Initialize the answer list with 1s
22 answer = [1] * num_nodes
23 # Create an adjacency list to represent the tree
24 graph = [[] for _ in range(num_nodes)]
25 # Initialize the index of node having value 1
26 node_with_one = -1
27 # Construct the graph and find the index of node with value 1
28 for i, parent in enumerate(parents):
29 # Exclude the root node which has no parent
30 if i:
31 graph[parent].append(i)
32 # If the node's value is 1, update node_with_one
33 if nums[i] == 1:
34 node_with_one = i
35 # If no node contains the value 1, return the answer array as it is
36 if node_with_one == -1:
37 return answer
38
39 # Initialize visited array to keep track of visited nodes
40 visited = [False] * num_nodes
41 # Array to track which values are present in the subtree
42 has_value = [False] * (num_nodes + 2)
43 # Missing value to start checking from (initially 2 since 1 is present)
44 missing_value = 2
45
46 # Process ancestors of the node with value 1 and their subtrees
47 while node_with_one != -1:
48 # Perform DFS starting from the current node_with_one to update has_value array
49 dfs(node_with_one)
50 # Find the smallest missing value in the subtree
51 while has_value[missing_value]:
52 missing_value += 1
53 # Update the answer for the current node
54 answer[node_with_one] = missing_value
55 # Move up to the parent of the current node
56 node_with_one = parents[node_with_one]
57
58 return answer
59
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.List;
4
5class Solution {
6 private List<Integer>[] adjacencyList; // Graph representation as an adjacency list
7 private boolean[] visited; // Track visited nodes during DFS
8 private boolean[] hasValue; // Track if a subtree has a particular value
9 private int[] nodeValues; // The values assigned to each node
10
11 // Method to find the smallest missing value for each subtree
12 public int[] smallestMissingValueSubtree(int[] parents, int[] nums) {
13 int n = nums.length;
14 nodeValues = nums;
15 adjacencyList = new List[n];
16 visited = new boolean[n];
17 hasValue = new boolean[n + 2]; // n+1 possible node values plus an extra for easier indexing
18
19 Arrays.setAll(adjacencyList, i -> new ArrayList<>());
20 int oneValueIndex = -1; // Index of the node with value 1
21
22 // Build the tree structure from parent pointers
23 for (int i = 0; i < n; ++i) {
24 if (i > 0) {
25 adjacencyList[parents[i]].add(i); // Link child to parent in the list
26 }
27 if (nums[i] == 1) {
28 oneValueIndex = i; // Find the index of the node that contains the value 1
29 }
30 }
31
32 int[] result = new int[n]; // The array to store smallest missing values
33 Arrays.fill(result, 1); // Default value if '1' is not in any subtree
34
35 if (oneValueIndex == -1) {
36 return result; // If there's no node with value '1', return all ones
37 }
38
39 // Perform DFS and update result while moving up the tree
40 for (int missingValue = 2; oneValueIndex != -1; oneValueIndex = parents[oneValueIndex]) {
41 performDFS(oneValueIndex);
42
43 // Find the smallest missing value in the current subtree
44 while (hasValue[missingValue]) {
45 ++missingValue;
46 }
47 result[oneValueIndex] = missingValue; // Assign smallest missing value to the result array
48 }
49
50 return result;
51 }
52
53 // Perform DFS starting from node 'i' to mark values present in the subtree
54 private void performDFS(int i) {
55 if (visited[i]) {
56 return; // If the node is already visited, we do nothing
57 }
58
59 visited[i] = true; // Mark the node as visited
60
61 // Mark the value present in the current node if in range
62 if (nodeValues[i] < hasValue.length) {
63 hasValue[nodeValues[i]] = true;
64 }
65
66 // Recursively visit all connected nodes
67 for (int child : adjacencyList[i]) {
68 performDFS(child);
69 }
70 }
71}
72
1#include <vector>
2#include <cstring>
3#include <functional>
4using namespace std;
5
6class Solution {
7public:
8 vector<int> smallestMissingValueSubtree(vector<int>& parents, vector<int>& nums) {
9 int n = nums.size(); // Number of nodes in the tree
10 vector<int> adjacencyList[n]; // Adjacency list for representing the tree structure
11 bool visited[n]; // Track visited nodes during DFS
12 bool hasValue[n + 2]; // Track if a value exists in the subtree rooted at the node (indexed by value)
13 memset(visited, false, sizeof(visited)); // Initialize visited array
14 memset(hasValue, false, sizeof(hasValue)); // Initialize hasValue array
15
16 int indexWithOne = -1; // To store the index of the node where the value is 1
17 for (int i = 0; i < n; ++i) {
18 if (i != 0) { // Skip the root (because the root has no parent)
19 adjacencyList[parents[i]].push_back(i); // Construct the adjacency list for the tree
20 }
21 if (nums[i] == 1) {
22 indexWithOne = i; // Find the node with value 1 and save its index
23 }
24 }
25
26 vector<int> answer(n, 1); // Initialize answer array with 1s
27
28 if (indexWithOne == -1) {
29 // If there is no node with value 1, all answers are 1
30 return answer;
31 }
32
33 // Define a Depth-First Search (DFS) lambda function
34 function<void(int)> dfs = [&](int node) {
35 if (visited[node]) {
36 return; // If node is already visited, do nothing
37 }
38 visited[node] = true; // Mark this node as visited
39 if (nums[node] < n + 2) {
40 hasValue[nums[node]] = true; // Mark the value that exists in the subtree
41 }
42 for (int child : adjacencyList[node]) {
43 dfs(child); // Run DFS recursively for each child
44 }
45 };
46
47 // Find the smallest missing values for the nodes from the node with value 1 up to the root
48 for (int missingValue = 2; indexWithOne != -1; indexWithOne = parents[indexWithOne]) {
49 dfs(indexWithOne); // Run DFS to populate hasValue array
50 while (hasValue[missingValue]) { // Until we find a missing value...
51 ++missingValue; // ...increment the missing value
52 }
53 answer[indexWithOne] = missingValue; // Set the answer for this node
54 }
55
56 return answer; // Return the final answer vector
57 }
58};
59
1function smallestMissingValueSubtree(parents: number[], nums: number[]): number[] {
2 const nodeCount = nums.length;
3 const graph: number[][] = Array.from({ length: nodeCount }, () => []);
4 const visited: boolean[] = Array(nodeCount).fill(false);
5 const hasValue: boolean[] = Array(nodeCount + 2).fill(false);
6 const result: number[] = Array(nodeCount).fill(1);
7 let oneValueIndex = -1;
8
9 // Populate the graph representation from the parent array and find the index with value 1
10 for (let i = 0; i < nodeCount; ++i) {
11 if (i) {
12 graph[parents[i]].push(i);
13 }
14 if (nums[i] === 1) {
15 oneValueIndex = i;
16 }
17 }
18
19 // If no node with value 1 is found, return the default result array
20 if (oneValueIndex === -1) {
21 return result;
22 }
23
24 // Depth-first search to update visited and hasValue arrays
25 const depthFirstSearch = (nodeIndex: number): void => {
26 if (visited[nodeIndex]) {
27 return;
28 }
29 visited[nodeIndex] = true;
30 if (nums[nodeIndex] < hasValue.length) {
31 hasValue[nums[nodeIndex]] = true;
32 }
33 for (const childIndex of graph[nodeIndex]) {
34 depthFirstSearch(childIndex);
35 }
36 };
37
38 // Search for the smallest missing value starting from the node with value 1
39 for (let missingValue = 2; oneValueIndex !== -1; oneValueIndex = parents[oneValueIndex]) {
40 depthFirstSearch(oneValueIndex);
41 while (hasValue[missingValue]) {
42 ++missingValue;
43 }
44 result[oneValueIndex] = missingValue;
45 }
46
47 return result;
48}
49
Time and Space Complexity
The time complexity of the code is O(n)
where n
is the number of nodes in the tree. This is because every node is accessed once during the DFS (Depth First Search) traversal. The DFS is called once for every node in the path from the node with the value '1' up to the root node. This results in a maximum of n
calls to DFS, each processing each node only once.
The space complexity of the code is O(n)
as well due to several factors:
-
The recursion stack for the DFS, which in the worst case could hold
n
function calls if the tree degenerates into a linked list. -
The
vis
array which is used to keep track of visited nodes in the DFS. The size of this array isn
. -
The
has
array which is used to keep track of the smallest missing values. Its size isn+2
, to cover all nodes values plus an extra space for detecting the missing value that is larger than the number of nodes. -
The
g
array (adjacency list) which is constructed at the start. Each node can have at mostn-1
children, but since it's a tree, there will be a total ofn-1
edges spread out across theg
list, still keeping the total space forg
withinO(n)
.
Summing up all these components still results in a total space complexity of O(n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
Which algorithm should you use to find a node that is close to the root of the tree?
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
https algomonster s3 us east 2 amazonaws com cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS
Union Find Disjoint Set Union Data Structure Introduction Prerequisite Depth First Search Review problems dfs_intro Once we have a strong grasp of recursion and Depth First Search we can now introduce Disjoint Set Union DSU This data structure is motivated by the following problem Suppose we have sets of elements
Want a Structured Path to Master System Design Too? Don’t Miss This!