Solution: Serialize and Deserialize Binary Tree

Let's solve the Serialize and Deserialize Binary Tree problem using the Tree Depth-first Search pattern.

Statement#

Serialize a given binary tree to a file and deserialize it back to a tree. Make sure that the original and the deserialized trees are identical.

  • Serialize: Write the tree to a file.

  • Deserialize: Read from a file and reconstruct the tree in memory.

Serialize the tree into a list of integers, and then, deserialize it back from the list to a tree. For simplicity’s sake, there’s no need to write the list to the files.

Constraints:

  • The number of nodes in the tree is in the range [0,500][0, 500].
  • 1000-1000 \leq Node.value 1000\leq 1000

Solution#

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach#

An initial idea may be to store one of the traversals of the tree into a file when serializing a tree and read that traversal back to create a tree when deserializing. However, any one of the traversals is not unique. That is, two different trees can have the same in-order traversal. The same goes for pre-order or post-order traversal as well. As a simple example, consider a right-slanting degenerate tree and a left-slanting degenerate tree. Both of these trees have the same in-order, pre-order as well as post-order traversals, but the are different trees. However, two traversals are sufficient to uniquely represent and reconstruct a binary tree.

For serialization, this approach will store both the inorder and preorder traversal and place a delimiter to separate them.

For deserialization, pick each node from the preorder traversal, make it root, and find its index in the inorder traversal. The nodes to the left of that index will be the part of the left subtree, and the nodes to the right of that index will be the part of the right subtree.

We got the required solution but at what cost? Can we avoid using twice the space? Can we avoid using tree traversal twice?

We can save the extra space by storing the preorder traversal of the tree and a marker for NULL pointers. The marker is used to identify the NULL nodes of the tree, which is helpful in deserializing the binary tree. We use the preorder (root, left, right) traversal to serialize the whole tree. While deserializing the binary tree, it’s easy to reconstruct the entire tree using preorder traversal. We start by creating the root, and then its left and right children, respectively. The preorder traversal comes under depth-first search.

Here is how we’ll implement this algorithm:

We’ll serialize the tree as follows:

  • We perform a depth-first traversal and serialize individual nodes to the stream.

  • We use a preorder (root, left, right) traversal here.

  • We serialize the marker to represent a NULL pointer, which helps deserialize the tree.

Consider the binary tree below as an example:

G 50 50 25 25 50->25 75 75 50->75 200 200 M5 M5 200->M5 350 350 200->350 M1 M1 25->M1 M2 M2 25->M2 M6 M6 350->M6 M7 M7 350->M7 M3 M3 M4 M4 100 100 100->50 100->200 75->M3 75->M4

The markers (M)(M*) are added to this tree to represent NULL nodes. The number with each marker, such as 11 in M1M1 or 22 in M2M2, represents only the relative position of a marker in the stream.

The serialized tree (preorder traversal) from the example above looks like the list below:

g array 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7

To deserialize the tree, we again use the preorder traversal and create a new node for every non-marker node. Encountering a marker indicates a NULL node.

Let’s run this approach on the above tree:

Created with Fabric.js 3.6.6 50 25 75 200 M5 350 M1 M2 M6 M7 M3 M4 100 Preorder traversal:

1 of 28

Created with Fabric.js 3.6.6 100 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

2 of 28

Created with Fabric.js 3.6.6 100 50 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

3 of 28

Created with Fabric.js 3.6.6 100 50 25 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

4 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

5 of 28

Created with Fabric.js 3.6.6 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 100 50 25 M1 M2 Preorder traversal:

6 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

7 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

8 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 M4 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

9 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 M4 200 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

10 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 M4 200 M5 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

11 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 M4 200 M5 350 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

12 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

13 of 28

Created with Fabric.js 3.6.6 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 200 M5 350 M1 M2 M6 M7 100 M3 M4 Preorder traversal:

14 of 28

Created with Fabric.js 3.6.6 Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7

15 of 28

Created with Fabric.js 3.6.6 Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 100 Deserialized tree:

16 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 100

17 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 100

18 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 M1 100

19 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 M1 M2 100

20 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 M1 M2 100

21 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 M1 M2 M3 100

22 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 M1 M2 M3 M4 100

23 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 200 M1 M2 M3 M4 100

24 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 200 M5 M1 M2 M3 M4 100

25 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 200 M5 350 M1 M2 M3 M4 100

26 of 28

Created with Fabric.js 3.6.6 Deserialized tree: Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 50 25 75 200 M5 350 M1 M2 M6 M3 M4 100

27 of 28

Created with Fabric.js 3.6.6 Serialized tree: 100 50 25 M1 M2 75 M3 M4 200 M5 350 M6 M7 Deserialized tree: 50 25 75 200 M5 350 M1 M2 M6 M7 M3 M4 100

28 of 28

Let’s look at the code for this solution below:

main.py
BinaryTree.py
TreeNode.py
Serialize and Deserialize Binary Tree

Solution summary#

To recap, the solution to this problem can be divided into the following two parts:

  • Perform a depth-first traversal and serialize individual nodes to the stream.
    • Serialize the marker to represent a NULL pointer as it will help to deserialize the tree.
  • Deserialize the tree using preorder traversal. During traversal, create a new node for every non-marker node.

Time complexity#

The time complexity of the serialization function is O(n)O(n), and the time complexity of the deserialization function is also O(n)O(n), where nn is the number of nodes in the binary tree. Therefore, the overall time complexity of this solution is O(n)O(n).

Space complexity#

The space complexity of this solution is O(h)O(h), where hh is the height of the tree. This is because our recursive algorithm uses space on the call stack, which can grow to the height (h)(h) of the binary tree. The complexity will be O(logn)O(\log n) for a balanced tree and O(n)O(n) for a degenerate tree.

Additional thoughts#

Furthermore, if we know that our tree is almost a complete binary tree, we can serialize it similarly to how heap data structures are stored in an array. We can use the two rules below to serialize and deserialize data:

  • The children of a node at index ii are located at indices 2i2 * i and (2i)+1(2 * i) + 1.
  • The parent of a node at index ii is located at index i/2i / 2.

Serialize and Deserialize Binary Tree

Binary Tree Maximum Path Sum