333. Largest BST Subtree
Problem Description
The problem asks to find the largest Binary Search Tree (BST) within a given binary tree. A largest BST here refers to a BST with the maximum number of nodes. It's important to remember that a binary tree is not necessarily a BST. A BST is a specific type of binary tree where the left subtree contains only nodes with values less than the root's value, and the right subtree only nodes with values greater than the root's value. Each subtree must also follow these rules.
In this context, a "subtree" is considered any node and all its descendants, and we need to identify the largest one that fulfills the constraints of a BST within the original binary tree. The solution should return the number of nodes contained in this largest BST.
Flowchart Walkthrough
Let's determine the appropriate algorithm for solving Leetcode problem 333, Largest BST Subtree, using the Flowchart. Here’s how we navigate through the decision-making process:
-
Is it a graph?
- No: This problem involves analyzing the structure of a binary tree to find the largest BST (Binary Search Tree) subtree.
-
Is it a tree?
- Yes: The problem explicitly deals with a binary tree.
-
Apply DFS
- Since the task is to find the largest BST subtree within a given tree, a DFS approach is suitable. DFS will allow us to check each node and its children recursively to verify BST properties and determine the size of the largest BST subtree at each node.
Conclusion: According to the flowchart, the problem best fits the Depth-First Search (DFS) pattern as it involves processing on a tree data structure to find a particular kind of subtree.
Intuition
This problem can be efficiently solved using a recursive depth-first search (DFS). The intuition is that for a subtree to be a BST, both its left and right children must also be BSTs, and the value of all nodes in the left subtree must be less than the root's value, and those in the right subtree must be greater.
The solution leverages a helper function, typically dfs(root)
, which performs the following tasks:
- It checks the validity of BST for each node.
- It keeps track of the number of nodes in the valid subtree.
- It keeps track of the maximum and minimum value within the subtree to easily compare with its parent.
This recursive function returns a tuple (min_value, max_value, number_of_nodes)
for each subtree analyzed. If the current subtree does not form a valid BST when considering its parent, it returns values that will cause the parent to also be invalid as a BST. This way, one does not need to further check subtrees of an already invalid BST, optimizing the process.
A key point for optimizing this algorithm is the early return when an invalid BST is found. Instead of checking all subtrees, the function stops early when it finds that the current subtree cannot possibly be a BST. This reduces the number of unnecessary checks and, therefore, the overall time complexity of the algorithm.
By maintaining a global or nonlocal variable that keeps track of the size of the largest BST found so far (ans
in this case), we can keep updating the size whenever we find a larger valid BST.
At the end, the variable ans
will hold the size of the largest found BST subtree, which we can return as the solution.
Learn more about Tree, Depth-First Search, Binary Search Tree, Dynamic Programming and Binary Tree patterns.
Solution Approach
The solution approach to finding the largest BST in a binary tree involves implementing a recursive Depth-First Search (DFS) that traverses the tree while performing checks and computations that ultimately reveal the largest BST subtree.
Here's a step-by-step breakdown of the dfs(root)
function and how it integrates within the solution:
-
A recursion base case is set. When the function encounters a
None
node (indicating an empty subtree), it returns(inf, -inf, 0)
. This ensures that an empty subtree will not affect the validity of its parent. -
The
dfs
function is called recursively on the left childdfs(root.left)
and the right childdfs(root.right)
of the currentroot
. Each call will return the minimum value, maximum value, and node count of the valid BST subtree rooted at that child. -
These values are used to determine if the current
root
with its subtrees can form a valid BST:- The maximum value in the left subtree (
lmx
) must be less thanroot.val
. - The minimum value in the right subtree (
rmi
) must be greater thanroot.val
.
- The maximum value in the left subtree (
-
When the current node's value respects the BST property in relation to its children's values, the current subtree rooted at
root
is a valid BST. The node count is the sum of the node counts from the left and right subtrees plus one (the currentroot
node). -
The global variable
ans
is updated with the maximum value between its current value and the newly found BST's node count. -
The function returns a tuple containing the new minimum value, maximum value, and the total node count of the valid BST. The minimum and maximum values are updated to ensure the parent node will be able to verify its BST properties correctly.
-
If the current subtree does not form a valid BST, the function returns
(-inf, inf, 0)
to indicate an invalid subtree. -
Once the entire tree has been traversed, the accumulated value of
ans
will represent the largest number of nodes found in a BST within the tree.
The code uses Python's nonlocal
keyword to modify the ans
variable defined outside of the nested dfs
function, which simplifies updating the count of the largest BST found during the recursive traversal.
In terms of data structures, the algorithm primarily operates on the binary tree itself and uses tuples to store and pass around multiple values compactly. The recursion stack implicitly created by the recursive calls is a critical part of the DFS traversal pattern applied here.
By systematically exploring each node and determining locally whether a valid BST exists at that node, then combining those local decisions to inform the parent node's validity, this approach effectively breaks down the problem into manageable subproblems adhering to the "divide and conquer" strategy, which is common in recursive algorithms.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's illustrate the solution approach with a small example. Suppose you have the following binary tree:
10 / \ 5 15 / \ \ 1 8 7
Each node contains a value, and we aim to find the largest BST within this binary tree.
-
We start with a DFS on the root node (10). We use the
dfs
function, which will return the minimum value, maximum value, and node count of the valid BST subtree. -
We call
dfs
recursively on the left child (5) and right child (15). -
For node 5, we call
dfs
recursively on node 1 (left child) and node 8 (right child).- Node 1 returns
(1, 1, 1)
because it has no children and is a valid BST on its own. - Node 8 returns
(8, 8, 1)
for the same reason. - Since the maximum value from the left
(1)
is less than 5, and the minimum value from the right(8)
is greater than 5, node 5 with its children is a valid BST. We return(1, 8, 3)
for node 5.
- Node 1 returns
-
For node 15, we don't have a left child, and the right child is node 7.
- Since 15 has no left child, we receive
(inf, -inf, 0)
from the left. - Node 7 returns
(7, 7, 1)
because it's a leaf node and hence a valid BST. - However, the minimum value from the right (7) is not greater than 15, so node 15 with its children cannot form a BST. Therefore, we return
(-inf, inf, 0)
for node 15, indicating it's an invalid subtree.
- Since 15 has no left child, we receive
-
Now, back at the root node 10, we use the results from its children to determine whether it can form a BST. The left child returned
(1, 8, 3)
and the right child(-inf, inf, 0)
.- The maximum value from the left (
8
) is less than 10, but we cannot use the results from the right child because it's invalid. - So, node 10 and its entire tree cannot form a valid BST. We return
(-inf, inf, 0)
for node 10.
- The maximum value from the left (
-
With these recursive calls, we maintain a global variable
ans
that is updated every time a valid subtree BST is found. In our case, the largest BST is the subtree rooted at node 5 with 3 nodes. -
Once we have completed the recursive DFS traversal, the accumulated value of
ans
will hold the size of the largest BST found within the binary tree, which is 3 in this example. We return this value as the solution to the problem.
Solution Implementation
1# Definition for a binary tree node.
2# class TreeNode:
3# def __init__(self, val=0, left=None, right=None):
4# self.val = val
5# self.left = left
6# self.right = right
7
8from math import inf
9
10class Solution:
11 def largestBSTSubtree(self, root: Optional[TreeNode]) -> int:
12 # Helper function to perform a DFS on the binary tree.
13 def dfs(node):
14 # Base case: If the node is None, return a tuple containing infinity,
15 # negative infinity, and 0, indicating an empty subtree.
16 if node is None:
17 return inf, -inf, 0
18
19 # Recursively obtain minimum value, maximum value, and size of the
20 # left and right subtrees.
21 left_min, left_max, left_size = dfs(node.left)
22 right_min, right_max, right_size = dfs(node.right)
23
24 # Use a nonlocal variable to keep track of the largest BST size found so far.
25 nonlocal largest_bst_size
26
27 # Check if the current tree rooted at 'node' is a BST by comparing its value
28 # with the maximum of the left subtree and the minimum of the right subtree.
29 if left_max < node.val < right_min:
30 # Update the largest BST size if the current subtree is larger.
31 largest_bst_size = max(largest_bst_size, left_size + right_size + 1)
32 # Return a tuple with the new minimum, maximum, and size of the current subtree.
33 return min(left_min, node.val), max(right_max, node.val), left_size + right_size + 1
34 else:
35 # If the current subtree is not a BST, return values that convey a violation
36 # in the BST property to parent nodes.
37 return -inf, inf, 0
38
39 # Initialize the largest BST size to 0.
40 largest_bst_size = 0
41 # Perform a DFS on the binary tree starting from the root.
42 dfs(root)
43 # After the DFS is completed, return the largest BST size found.
44 return largest_bst_size
45
1/**
2 * Definition for a binary tree node.
3 */
4class TreeNode {
5 int val;
6 TreeNode left;
7 TreeNode right;
8
9 TreeNode() {}
10
11 TreeNode(int val) {
12 this.val = val;
13 }
14
15 TreeNode(int val, TreeNode left, TreeNode right) {
16 this.val = val;
17 this.left = left;
18 this.right = right;
19 }
20}
21
22class Solution {
23 private int maxBSTSize; // Variable to track the size of the largest BST.
24
25 /**
26 * Returns the size of the largest BST in a binary tree.
27 *
28 * @param root the root of the binary tree
29 * @return the size of the largest BST
30 */
31 public int largestBSTSubtree(TreeNode root) {
32 maxBSTSize = 0;
33 dfs(root);
34 return maxBSTSize;
35 }
36
37 /**
38 * Performs a depth-first search to find the largest BST.
39 *
40 * @param node the current node
41 * @return an array containing the minimum value, the maximum value,
42 * and the size of the largest BST in the subtree rooted at the current node
43 */
44 private int[] dfs(TreeNode node) {
45 // Base case: if node is null, return "empty" values that won't affect parent node's calculation.
46 if (node == null) {
47 return new int[]{Integer.MAX_VALUE, Integer.MIN_VALUE, 0};
48 }
49
50 // Recursive DFS calls for left and right subtrees.
51 int[] leftSubtree = dfs(node.left);
52 int[] rightSubtree = dfs(node.right);
53
54 // Check if current node's subtree is a BST.
55 if (leftSubtree[1] < node.val && node.val < rightSubtree[0]) {
56 // Calculate the size of this subtree (if it's a valid BST).
57 int sizeOfCurrentBST = leftSubtree[2] + rightSubtree[2] + 1;
58
59 // Update the answer if this is the largest BST so far.
60 maxBSTSize = Math.max(maxBSTSize, sizeOfCurrentBST);
61
62 // Return the updated information for this valid BST.
63 return new int[]{
64 Math.min(node.val, leftSubtree[0]), // Minimum value in the subtree.
65 Math.max(node.val, rightSubtree[1]), // Maximum value in the subtree.
66 sizeOfCurrentBST // Size of the subtree.
67 };
68 }
69 // If the subtree with this node is not a BST, return "invalid" values.
70 return new int[]{Integer.MIN_VALUE, Integer.MAX_VALUE, 0};
71 }
72}
73
1/**
2 * Definition for a binary tree node.
3 */
4struct TreeNode {
5 int val;
6 TreeNode *left;
7 TreeNode *right;
8 TreeNode() : val(0), left(nullptr), right(nullptr) {}
9 TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
10 TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {}
11};
12
13class Solution {
14public:
15 int largestSize; // Using "largestSize" to store largest BST subtree size.
16
17 // Public member function to initiate the computation of largest BST subtree size.
18 int largestBSTSubtree(TreeNode* root) {
19 largestSize = 0;
20 dfsHelper(root);
21 return largestSize;
22 }
23
24 // Helper function to perform depth-first search and return essential info of each subtree.
25 vector<int> dfsHelper(TreeNode* root) {
26 // Base case: If current node is null, return maximum, minimum and size as specified.
27 if (!root) return {INT_MAX, INT_MIN, 0};
28
29 // Recursive cases for left and right subtrees.
30 auto leftInfo = dfsHelper(root->left);
31 auto rightInfo = dfsHelper(root->right);
32
33 // If current subtree is BST, calculate subtree size and update "largestSize".
34 if (leftInfo[1] < root->val && root->val < rightInfo[0]) {
35 int subtreeSize = leftInfo[2] + rightInfo[2] + 1;
36 largestSize = max(largestSize, subtreeSize);
37 // Return a vector containing the minimum value, maximum value, and size of this BST.
38 return {min(root->val, leftInfo[0]), max(root->val, rightInfo[1]), subtreeSize};
39 }
40
41 // If current subtree is not BST, return values that invalidate the parent node's BST status.
42 return {INT_MIN, INT_MAX, 0};
43 }
44};
45
1// Definition for a binary tree node in TypeScript.
2interface TreeNode {
3 val: number;
4 left: TreeNode | null;
5 right: TreeNode | null;
6}
7
8// Variables and functions declared globally as per instructions.
9
10// Variable to store largest BST subtree size.
11let largestSize: number = 0;
12
13/**
14 * Initiates the computation of the largest BST subtree size.
15 * @param root - The root node of the binary tree.
16 * @returns The size of the largest BST subtree.
17 */
18function largestBSTSubtree(root: TreeNode | null): number {
19 largestSize = 0;
20 dfsHelper(root);
21 return largestSize;
22}
23
24/**
25 * Helper function to perform depth-first search and returns essential info of each subtree.
26 * @param root - The current node of the binary tree.
27 * @returns An array containing the minimum value, maximum value, and size of the subtree.
28 */
29function dfsHelper(root: TreeNode | null): [number, number, number] {
30 // Base case: If current node is null, return maximum, minimum and size as specified.
31 if (!root) return [Number.MAX_SAFE_INTEGER, Number.MIN_SAFE_INTEGER, 0];
32
33 // Recursive cases for left and right subtrees.
34 let leftInfo = dfsHelper(root.left);
35 let rightInfo = dfsHelper(root.right);
36
37 // If current subtree is BST, calculate subtree size and update "largestSize".
38 if (leftInfo[1] < root.val && root.val < rightInfo[0]) {
39 let subtreeSize = leftInfo[2] + rightInfo[2] + 1;
40 largestSize = Math.max(largestSize, subtreeSize);
41 // Return an array containing the min value, max value, and size of this BST.
42 return [Math.min(root.val, leftInfo[0]), Math.max(root.val, rightInfo[1]), subtreeSize];
43 }
44
45 // If current subtree is not BST, return values that invalidate the parent node's BST status.
46 return [Number.MIN_SAFE_INTEGER, Number.MAX_SAFE_INTEGER, 0];
47}
48
Time and Space Complexity
The given code defines a function largestBSTSubtree
that finds the size of the largest BST (Binary Search Tree) within a binary tree. Let's analyze its time and space complexity.
Time Complexity
The main computation in the code is done by the dfs
(Depth-First Search) function, which is a recursive function that is called on every node of the tree exactly once. At each node, dfs
performs a constant amount of work. It checks if the current subtree is a BST, and updates the minimum and maximum value seen so far along with the count of nodes in the subtree.
Thus, for a binary tree with n
nodes, every node is visited once, resulting in a time complexity of O(n)
.
Space Complexity
The space complexity of the code comes from two parts: the recursion stack and the nonlocal variable ans
.
-
Recursion Stack: In the worst case, the recursion goes as deep as the height of the tree. For a balanced binary tree, the height is
log(n)
, leading to a space complexity ofO(log(n))
because of the recursion stack. However, in the worst case (a degenerate tree where each node has only one child), the height of the tree becomesn
, which would result in a space complexity ofO(n)
for the recursion stack. -
Nonlocal Variable: The nonlocal variable
ans
is used to track the maximum size of the BST found at any node. This only uses a constant amount of space,O(1)
.
The overall space complexity is then the maximum space required by any part of the algorithm. Thus, the space complexity of the algorithm is O(h)
where h
is the height of the tree. In the worst case, this simplifies to O(n)
.
Learn more about how to find time and space complexity quickly using problem constraints.
How does merge sort divide the problem into subproblems?
Recommended Readings
Everything About Trees A tree is a type of graph data structure composed of nodes and edges Its main properties are It is acyclic doesn't contain any cycles There exists a path from the root to any node Has N 1 edges where N is the number of nodes in the tree and
https algomonster s3 us east 2 amazonaws com cover_photos dfs svg Depth First Search Prereqs Recursion Review problems recursion_intro Trees problems tree_intro With a solid understanding of recursion under our belts we are now ready to tackle one of the most useful techniques in coding interviews Depth First Search DFS
Binary Search Tree Intro Definition A Binary Search Tree or BST is a rooted binary tree with the value of each internal node being greater than all the values in the respective node's left subtree and less than the ones in its right subtree An empty tree is a BST since it satisfies the
Want a Structured Path to Master System Design Too? Don’t Miss This!