Solution: Minimum Height Trees
Let's solve the Minimum Height Trees problem using the Topological Sort pattern.
Statement#
You are given as the number of nodes and a 2D array of edges representing a tree having the following properties:
-
It is an acyclic, undirected graph containing nodes and edges.
-
The nodes are labeled from to .
-
An edge at the index is represented as , which denotes that there is an undirected edge connecting the nodes and .
Your task is to find all possible root nodes that can be used to create trees of minimum height. The height of a tree is the length of the longest path from the root node to any leaf node.
Note: The order of root nodes in the result does not matter.
Constraints
edges.length- Each edge is unique.
Solution#
So far, you have probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.
Naive approach#
A naive approach is to use Depth-first Search (DFS) to repeatedly select a node, treat it as the root of a tree, and perform DFS to calculate the height of the tree. The goal is to find the set of nodes that, when chosen as roots, result in trees with the minimum possible height.
The time complexity of this approach is , where is the number of nodes. This is because the DFS from each node takes time, and we repeat this process for nodes.
The space complexity of this approach is , since the recursive call stack (recursive approach), or stack data structure (iterative approach) can store at most elements.
Optimized approach using Topological Sort#
The idea is to select the central nodes of the tree as root candidates, as these nodes distribute the height as evenly as possible among the subtrees. This is because when choosing nodes from the center, we effectively divide the tree into subtrees with roughly the same number of nodes, making it more likely that the subtrees will have similar heights. Otherwise, if we choose a root node that is closer to one side of the tree, it may result in one subtree being much deeper than the other, leading to an unbalanced tree with a larger height.
To understand this concept, let’s take an example of a circle. In a circle, the diameter is the longest path from one point to another that passes through the center. Similarly, consider the tree as a circle, where the leaf nodes make up the circumference. We consider the diameter of the tree, that is, the maximum length path between two leaf nodes, where no node is repeated. The center node(s) in this diameter will be the root node(s) from which the height can be minimized.
We can safely assume that a tree can contain at most two possible central nodes that can be chosen as the root to minimize its height. To explain this, consider the diameter of a tree as a horizontal chain that can be categorized as follows:
-
Odd-length chain: The middle node in this chain will be the root node that minimizes the height. This is because the length of the path from this node to the leaf node will be the minimum length path (height). No other node can be selected as the root since it will lead to a longer length path (height) to a leaf node.
-
Even-length chain: The two middle nodes in this chain will be the root nodes that minimize the height. We can select two candidate nodes in this case since an even-length chain does not contain a single middle node. The length of the path from these nodes to the leaf node will be the minimum length path (height).
The slides below illustrate this concept:
1 of 11
2 of 11
3 of 11
4 of 11
5 of 11
6 of 11
7 of 11
8 of 11
9 of 11
10 of 11
11 of 11
To obtain the center point of a circle, we can remove the points from its circumference. Now, we have a new circumference. We repeat this process of removing points on the circumference until only the center point is left.
Similarly, we consider the tree as a circle, where the leaf nodes form the circumference. We remove all the leaf nodes, thereby obtaining a new set of leaf nodes. We repeat this process of removing leaf nodes until we reach the central node(s), which are either , or nodes.
The slides below illustrate this concept:
1 of 9
2 of 9
3 of 9
4 of 9
5 of 9
6 of 9
7 of 9
8 of 9
9 of 9
Here are the steps of the algorithm:
-
If the number of nodes in the tree is less than, or equal to , the tree contains only its central node(s). Therefore, return the node(s) in an array.
-
Otherwise, traverse the
edgesarray to create an adjacency list,adj, to make it easier for us to identify and remove the leaf nodes. The adjacency list will be a 2D array where each arrayadj[i]will contain the neighbors of the node. -
Identify the first group of leaf nodes and store them in an array (initially empty),
leaves:-
Traverse
adjto identify the leaf nodes. For eachadj[i], we will check if its length is equal to :-
If it is, the node is a leaf node, since it is connected to only one other node. Therefore, we will add it to the
leavesarray. -
Otherwise, we will skip,
adj[i], since the node is not a leaf node.
-
-
-
Remove the leaf nodes from the tree by updating
adj, until the number of remaining nodes is less than, or equal to . The number of remaining nodes are tracked using the variablerem_nodes, which was initialized to :-
Perform some preliminary steps before removing the leaf nodes:
-
Update
rem_nodesby subtracting it fromn. This will give us the number of nodes left in the tree. -
Initialize an empty array,
temp_leaves, that will temporarily store the new leaves of the resulting tree, while the outer layer of leaves is being removed.
-
-
We remove each leaf node from the end of the
leavesarray until it is empty:-
The removed leaf node is stored in a variable named
leaf. -
We remove the edge from the leaf node to its only neighbor by removing the node at
adj[leaf]. This removed node is stored in a variable,neighbor:adj[leaf].pop() -
Similarly, we also remove the edge from the neighbor to the leaf node by removing the leaf node from
adj[neighbor]:adj[neighbor].remove(leaf) -
Lastly, we check if the
neighboris a now a new leaf node by checking if the length ofneighbors[neighbor]is equal to . If it is, we addneighbortotemp_leaves. -
The process above is repeated until all leaf nodes have been removed from the
leavesarray. This means that the outer layer of leaf nodes has now been removed from the tree. Thetemp_leavesarray now contains the next layer of leaf nodes. Therefore, we updateleavestotemp_leaves.
-
-
-
The process above is repeated, until
rem_nodesis less than, or equal to , in which case only the central nodes remain in the tree. Therefore,leavesnow contains these central nodes, so we return it.
The slides below illustrate how the algorithm runs:
1 of 40
2 of 40
3 of 40
4 of 40
5 of 40
6 of 40
7 of 40
8 of 40
9 of 40
10 of 40
11 of 40
12 of 40
13 of 40
14 of 40
15 of 40
16 of 40
17 of 40
18 of 40
19 of 40
20 of 40
21 of 40
22 of 40
23 of 40
24 of 40
25 of 40
26 of 40
27 of 40
28 of 40
29 of 40
30 of 40
31 of 40
32 of 40
33 of 40
34 of 40
35 of 40
36 of 40
37 of 40
38 of 40
39 of 40
40 of 40
Let’s look at the code for this solution below:
Solution summary#
The solution can be summarized in the following steps:
-
Construct an adjacency list representation of a graph from given edges.
-
If the graph has or nodes, return them as the root nodes of the minimum height trees.
-
Iteratively prune the graph by removing leaf nodes (nodes with only one neighbor).
-
Repeat this process until only or nodes are left in the graph. These remaining nodes are returned as the root nodes for the minimum height trees.
Time complexity#
The time complexity of this solution is .
Explanation:
-
Time taken to construct the adjacency list: .
-
Time taken to populate the
leavesarray with the first layer of leaf nodes: . -
Time taken to remove all the leaf nodes until the central nodes are obtained:
Therefore, the overall time complexity becomes .
Space complexity#
The space complexity of this solution is .
Explanation:
-
Space occupied by the adjacency list: .
-
Space occupied by the
leavesarray: .
So the overall space complexity becomes .
Minimum Height Trees
Serialize and Deserialize Binary Tree