Solution: Minimum Height Trees

Let's solve the Minimum Height Trees problem using the Topological Sort pattern.

Statement#

You are given nn as the number of nodes and a 2D array of edges representing a tree having the following properties:

  • It is an acyclic, undirected graph containing nn nodes and n1n-1 edges.

  • The nodes are labeled from 00 to n1n-1.

  • An edge at the ithi^{th} index is represented as [ai,bi][a_i, b_i], which denotes that there is an undirected edge connecting the nodes aia_i and bib_i.

Your task is to find all possible root nodes that can be used to create trees of minimum height. The height of a tree is the length of the longest path from the root node to any leaf node.

Note: The order of root nodes in the result does not matter.

Constraints

  • 1n5001 \leq n \leq 500
  • edges.length ==n1== n - 1
  • aibia_i \neq b_i
  • Each edge is unique.

Solution#

So far, you have probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach#

A naive approach is to use Depth-first Search (DFS) to repeatedly select a node, treat it as the root of a tree, and perform DFS to calculate the height of the tree. The goal is to find the set of nodes that, when chosen as roots, result in trees with the minimum possible height.

The time complexity of this approach is O(n2)O(n^2), where nn is the number of nodes. This is because the DFS from each node takes O(n)O(n) time, and we repeat this process for nn nodes.

The space complexity of this approach is O(n)O(n), since the recursive call stack (recursive approach), or stack data structure (iterative approach) can store at most nn elements.

Optimized approach using Topological Sort#

The idea is to select the central nodes of the tree as root candidates, as these nodes distribute the height as evenly as possible among the subtrees. This is because when choosing nodes from the center, we effectively divide the tree into subtrees with roughly the same number of nodes, making it more likely that the subtrees will have similar heights. Otherwise, if we choose a root node that is closer to one side of the tree, it may result in one subtree being much deeper than the other, leading to an unbalanced tree with a larger height.

To understand this concept, let’s take an example of a circle. In a circle, the diameter is the longest path from one point to another that passes through the center. Similarly, consider the tree as a circle, where the leaf nodes make up the circumference. We consider the diameter of the tree, that is, the maximum length path between two leaf nodes, where no node is repeated. The center node(s) in this diameter will be the root node(s) from which the height can be minimized.

We can safely assume that a tree can contain at most two possible central nodes that can be chosen as the root to minimize its height. To explain this, consider the diameter of a tree as a horizontal chain that can be categorized as follows:

  • Odd-length chain: The middle node in this chain will be the root node that minimizes the height. This is because the length of the path from this node to the leaf node will be the minimum length path (height). No other node can be selected as the root since it will lead to a longer length path (height) to a leaf node.

  • Even-length chain: The two middle nodes in this chain will be the root nodes that minimize the height. We can select two candidate nodes in this case since an even-length chain does not contain a single middle node. The length of the path from these nodes to the leaf node will be the minimum length path (height).

The slides below illustrate this concept:

canvasAnimation-image

1 of 11

canvasAnimation-image

2 of 11

canvasAnimation-image

3 of 11

canvasAnimation-image

4 of 11

canvasAnimation-image

5 of 11

canvasAnimation-image

6 of 11

canvasAnimation-image

7 of 11

canvasAnimation-image

8 of 11

canvasAnimation-image

9 of 11

canvasAnimation-image

10 of 11

canvasAnimation-image

11 of 11

To obtain the center point of a circle, we can remove the points from its circumference. Now, we have a new circumference. We repeat this process of removing points on the circumference until only the center point is left.

Similarly, we consider the tree as a circle, where the leaf nodes form the circumference. We remove all the leaf nodes, thereby obtaining a new set of leaf nodes. We repeat this process of removing leaf nodes until we reach the central node(s), which are either 11, or 22 nodes.

The slides below illustrate this concept:

canvasAnimation-image

1 of 9

canvasAnimation-image

2 of 9

canvasAnimation-image

3 of 9

canvasAnimation-image

4 of 9

canvasAnimation-image

5 of 9

canvasAnimation-image

6 of 9

canvasAnimation-image

7 of 9

canvasAnimation-image

8 of 9

canvasAnimation-image

9 of 9

Here are the steps of the algorithm:

  • If the number of nodes in the tree is less than, or equal to 22, the tree contains only its central node(s). Therefore, return the node(s) in an array.

  • Otherwise, traverse the edges array to create an adjacency list, adj, to make it easier for us to identify and remove the leaf nodes. The adjacency list will be a 2D array where each array adj[i] will contain the neighbors of the ithi^{th} node.

  • Identify the first group of leaf nodes and store them in an array (initially empty), leaves:

    • Traverse adj to identify the leaf nodes. For each adj[i], we will check if its length is equal to 11:

      • If it is, the ithi^{th} node is a leaf node, since it is connected to only one other node. Therefore, we will add it to the leaves array.

      • Otherwise, we will skip, adj[i] , since the ithi^{th} node is not a leaf node.

  • Remove the leaf nodes from the tree by updating adj, until the number of remaining nodes is less than, or equal to 22. The number of remaining nodes are tracked using the variable rem_nodes, which was initialized to 00:

    • Perform some preliminary steps before removing the leaf nodes:

      • Update rem_nodes by subtracting it from n. This will give us the number of nodes left in the tree.

      • Initialize an empty array, temp_leaves, that will temporarily store the new leaves of the resulting tree, while the outer layer of leaves is being removed.

    • We remove each leaf node from the end of the leaves array until it is empty:

      • The removed leaf node is stored in a variable named leaf.

      • We remove the edge from the leaf node to its only neighbor by removing the node at adj[leaf]. This removed node is stored in a variable, neighbor:

        adj[leaf].pop()
      • Similarly, we also remove the edge from the neighbor to the leaf node by removing the leaf node from adj[neighbor]:

        adj[neighbor].remove(leaf)
      • Lastly, we check if the neighbor is a now a new leaf node by checking if the length of neighbors[neighbor] is equal to 11. If it is, we add neighbor to temp_leaves.

      • The process above is repeated until all leaf nodes have been removed from the leaves array. This means that the outer layer of leaf nodes has now been removed from the tree. The temp_leaves array now contains the next layer of leaf nodes. Therefore, we update leaves to temp_leaves.

  • The process above is repeated, until rem_nodes is less than, or equal to 22, in which case only the central nodes remain in the tree. Therefore, leaves now contains these central nodes, so we return it.

The slides below illustrate how the algorithm runs:

canvasAnimation-image

1 of 40

canvasAnimation-image

2 of 40

canvasAnimation-image

3 of 40

canvasAnimation-image

4 of 40

canvasAnimation-image

5 of 40

canvasAnimation-image

6 of 40

canvasAnimation-image

7 of 40

canvasAnimation-image

8 of 40

canvasAnimation-image

9 of 40

canvasAnimation-image

10 of 40

canvasAnimation-image

11 of 40

canvasAnimation-image

12 of 40

canvasAnimation-image

13 of 40

canvasAnimation-image

14 of 40

canvasAnimation-image

15 of 40

canvasAnimation-image

16 of 40

canvasAnimation-image

17 of 40

canvasAnimation-image

18 of 40

canvasAnimation-image

19 of 40

canvasAnimation-image

20 of 40

canvasAnimation-image

21 of 40

canvasAnimation-image

22 of 40

canvasAnimation-image

23 of 40

canvasAnimation-image

24 of 40

canvasAnimation-image

25 of 40

canvasAnimation-image

26 of 40

canvasAnimation-image

27 of 40

canvasAnimation-image

28 of 40

canvasAnimation-image

29 of 40

canvasAnimation-image

30 of 40

canvasAnimation-image

31 of 40

canvasAnimation-image

32 of 40

canvasAnimation-image

33 of 40

canvasAnimation-image

34 of 40

canvasAnimation-image

35 of 40

canvasAnimation-image

36 of 40

canvasAnimation-image

37 of 40

canvasAnimation-image

38 of 40

canvasAnimation-image

39 of 40

canvasAnimation-image

40 of 40

Let’s look at the code for this solution below:

Minimum Height Trees

Solution summary#

The solution can be summarized in the following steps:

  • Construct an adjacency list representation of a graph from given edges.

  • If the graph has 11 or 22 nodes, return them as the root nodes of the minimum height trees.

  • Iteratively prune the graph by removing leaf nodes (nodes with only one neighbor).

  • Repeat this process until only 11 or 22 nodes are left in the graph. These remaining nodes are returned as the root nodes for the minimum height trees.

Time complexity#

The time complexity of this solution is O(n)O(n).

Explanation:

  • Time taken to construct the adjacency list: O(n1)O(n - 1).

  • Time taken to populate the leaves array with the first layer of leaf nodes: O(n)O(n).

  • Time taken to remove all the leaf nodes until the central nodes are obtained: O(n+2(n1))=O(n)O(n + 2(n - 1)) = O(n)

Therefore, the overall time complexity becomes O(n1+n+n)=O(n)O(n - 1 + n + n) = O(n).

Space complexity#

The space complexity of this solution is O(n)O(n).

Explanation:

  • Space occupied by the adjacency list: O(n+(n1))=O(n)O(n + (n - 1)) = O(n).

  • Space occupied by the leaves array: O(n)O(n).

So the overall space complexity becomes O(n+n)=O(n)O(n + n) = O(n).

Minimum Height Trees

Serialize and Deserialize Binary Tree