Solution: Minimum Height Trees

Let's solve the Minimum Height Trees problem using the Topological Sort pattern.

We'll cover the following

Statement
Solution
- Naive approach
- Optimized approach using Topological Sort

Statement#

You are given $n$ as the number of nodes and a 2D array of edges representing a tree having the following properties:

It is an acyclic, undirected graph containing $n$ nodes and $n-1$ edges.
The nodes are labeled from $0$ to $n-1$ .
An edge at the $i^{th}$ index is represented as $[a_i, b_i]$ , which denotes that there is an undirected edge connecting the nodes $a_i$ and $b_i$ .

Your task is to find all possible root nodes that can be used to create trees of minimum height. The height of a tree is the length of the longest path from the root node to any leaf node.

Note: The order of root nodes in the result does not matter.

Constraints

$1 \leq n \leq 500$
edges.length $== n - 1$
$a_i \neq b_i$
Each edge is unique.

Solution#

So far, you have probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach#

A naive approach is to use Depth-first Search (DFS) to repeatedly select a node, treat it as the root of a tree, and perform DFS to calculate the height of the tree. The goal is to find the set of nodes that, when chosen as roots, result in trees with the minimum possible height.

The time complexity of this approach is $O(n^2)$ , where $n$ is the number of nodes. This is because the DFS from each node takes $O(n)$ time, and we repeat this process for $n$ nodes.

The space complexity of this approach is $O(n)$ , since the recursive call stack (recursive approach), or stack data structure (iterative approach) can store at most $n$ elements.

Optimized approach using Topological Sort#

The idea is to select the central nodes of the tree as root candidates, as these nodes distribute the height as evenly as possible among the subtrees. This is because when choosing nodes from the center, we effectively divide the tree into subtrees with roughly the same number of nodes, making it more likely that the subtrees will have similar heights. Otherwise, if we choose a root node that is closer to one side of the tree, it may result in one subtree being much deeper than the other, leading to an unbalanced tree with a larger height.

To understand this concept, let’s take an example of a circle. In a circle, the diameter is the longest path from one point to another that passes through the center. Similarly, consider the tree as a circle, where the leaf nodes make up the circumference. We consider the diameter of the tree, that is, the maximum length path between two leaf nodes, where no node is repeated. The center node(s) in this diameter will be the root node(s) from which the height can be minimized.

We can safely assume that a tree can contain at most two possible central nodes that can be chosen as the root to minimize its height. To explain this, consider the diameter of a tree as a horizontal chain that can be categorized as follows:

Odd-length chain: The middle node in this chain will be the root node that minimizes the height. This is because the length of the path from this node to the leaf node will be the minimum length path (height). No other node can be selected as the root since it will lead to a longer length path (height) to a leaf node.
Even-length chain: The two middle nodes in this chain will be the root nodes that minimize the height. We can select two candidate nodes in this case since an even-length chain does not contain a single middle node. The length of the path from these nodes to the leaf node will be the minimum length path (height).

The slides below illustrate this concept:

To obtain the center point of a circle, we can remove the points from its circumference. Now, we have a new circumference. We repeat this process of removing points on the circumference until only the center point is left.

Similarly, we consider the tree as a circle, where the leaf nodes form the circumference. We remove all the leaf nodes, thereby obtaining a new set of leaf nodes. We repeat this process of removing leaf nodes until we reach the central node(s), which are either $1$ , or $2$ nodes.

The slides below illustrate this concept:

Here are the steps of the algorithm:

If the number of nodes in the tree is less than, or equal to $2$ , the tree contains only its central node(s). Therefore, return the node(s) in an array.
Otherwise, traverse the edges array to create an adjacency list, adj, to make it easier for us to identify and remove the leaf nodes. The adjacency list will be a 2D array where each array adj[i] will contain the neighbors of the $i^{th}$ node.
Identify the first group of leaf nodes and store them in an array (initially empty), leaves:
- Traverse adj to identify the leaf nodes. For each adj[i], we will check if its length is equal to $1$ :
  - If it is, the $i^{th}$ node is a leaf node, since it is connected to only one other node. Therefore, we will add it to the leaves array.
  - Otherwise, we will skip, adj[i] , since the $i^{th}$ node is not a leaf node.
Remove the leaf nodes from the tree by updating adj, until the number of remaining nodes is less than, or equal to $2$ . The number of remaining nodes are tracked using the variable rem_nodes, which was initialized to $0$ :
- Perform some preliminary steps before removing the leaf nodes:
  - Update rem_nodes by subtracting it from n. This will give us the number of nodes left in the tree.
  - Initialize an empty array, temp_leaves, that will temporarily store the new leaves of the resulting tree, while the outer layer of leaves is being removed.
- We remove each leaf node from the end of the leaves array until it is empty:
  - The removed leaf node is stored in a variable named leaf.
  - We remove the edge from the leaf node to its only neighbor by removing the node at adj[leaf]. This removed node is stored in a variable, neighbor:
```
adj[leaf].pop()
```
  - Similarly, we also remove the edge from the neighbor to the leaf node by removing the leaf node from adj[neighbor]:
```
adj[neighbor].remove(leaf)
```
  - Lastly, we check if the neighbor is a now a new leaf node by checking if the length of neighbors[neighbor] is equal to $1$ . If it is, we add neighbor to temp_leaves.
  - The process above is repeated until all leaf nodes have been removed from the leaves array. This means that the outer layer of leaf nodes has now been removed from the tree. The temp_leaves array now contains the next layer of leaf nodes. Therefore, we update leaves to temp_leaves.
The process above is repeated, until rem_nodes is less than, or equal to $2$ , in which case only the central nodes remain in the tree. Therefore, leaves now contains these central nodes, so we return it.

The slides below illustrate how the algorithm runs:

# Function to construct an adjacency list representation of a graph
def construct_adjacency_list(n, edges):
    adj = [set() for i in range(n)]
    for v1, v2 in edges:
        adj[v1].add(v2)
        adj[v2].add(v1)
    return adj
# Function to find the minimum height trees in a graph
def min_height_trees(n, edges):
    if n <= 2:
        return [i for i in range(n)]
    adj = construct_adjacency_list(n, edges)
    leaves = [i for i in range(n) if len(adj[i]) == 1]
    rem_nodes = n
    while rem_nodes > 2:
        rem_nodes -= len(leaves)
        temp_leaves = []
        while leaves:
            leaf = leaves.pop()
            neighbor = adj[leaf].pop()
            adj[neighbor].remove(leaf)
            if len(adj[neighbor]) == 1:

Enter to Rename, Shift+Enter to Preview

Minimum Height Trees

Solution summary#

The solution can be summarized in the following steps:

Construct an adjacency list representation of a graph from given edges.
If the graph has $1$ or $2$ nodes, return them as the root nodes of the minimum height trees.
Iteratively prune the graph by removing leaf nodes (nodes with only one neighbor).
Repeat this process until only $1$ or $2$ nodes are left in the graph. These remaining nodes are returned as the root nodes for the minimum height trees.

Time complexity#

The time complexity of this solution is $O(n)$ .

Explanation:

Time taken to construct the adjacency list: $O(n - 1)$ .
Time taken to populate the leaves array with the first layer of leaf nodes: $O(n)$ .
Time taken to remove all the leaf nodes until the central nodes are obtained: $O(n + 2(n - 1)) = O(n)$

Therefore, the overall time complexity becomes $O(n - 1 + n + n) = O(n)$ .

Space complexity#

The space complexity of this solution is $O(n)$ .

Explanation:

Space occupied by the adjacency list: $O(n + (n - 1)) = O(n)$ .
Space occupied by the leaves array: $O(n)$ .

So the overall space complexity becomes $O(n + n) = O(n)$ .

Minimum Height Trees

Serialize and Deserialize Binary Tree