Trie: Introduction

Let’s go over the Trie pattern, its real-world applications, and some problems we can solve with it.

Overview#

Trie is a tree data structure used for storing and locating keys from a set. The keys are usually strings that are stored character by character—each node of a trie corresponds to a single character rather than the entire key.

The order of characters in a string is represented by edges between the adjacent nodes. For example, in the string “are”, there will be an edge from node aa to node rr to node ee. That is, node aa will be the parent of node rr, and node rr will be the parent of node ee.

The following illustration can help you understand how the strings are stored:

Storing words in a trie

This way, additional space is not required for storing strings with common prefixes. We can keep moving down the tree until a new character, that’s not present in the node’s children, is encountered and add it as a new node. Similarly, searches can also be performed using depth first search by following the edges between the nodes. Essentially, in a trie, words with the same prefix or stem share the memory area that corresponds to the prefix.

To understand how tries are more efficient for storing and searching strings, consider a binary tree. The time complexity of a binary tree is O(logn)O(\log n), where we talk in terms of log\log base 22. Instead, think of a quaternary tree, where every node has a fan-out of four, so each node can have four children. The time complexity of this tree is still O(logn)O(\log n). However, now we’re talking in terms of log\log with base 44. That’s an improvement in the performance even if it’s by a constant factor. As our trees become wider and shorter, the operations become more efficient. This is because we don’t have to traverse as deep.

Binary Tree
Binary Tree
Quaternary Tree
Quaternary Tree
Viewer does not support full SVG 1.1

This is exactly the motivation behind a trie. What if we had an nn-ary tree with the fan-out equal to the number of unique values in the given dataset? For example, if we’re considering strings in English, the fan-out would be 2626, corresponding to the number of letters in the English language. This makes the tree wider and shorter! The maximum depth of the trie would be the maximum length of a word or string.

The following illustration shows how strings are stored in a trie:

Created with Fabric.js 3.6.6 Strings art hell hey how root

1 of 5

Created with Fabric.js 3.6.6 root a r t Strings art hell hey how

2 of 5

Created with Fabric.js 3.6.6 Strings art hell hey how root a h r t e l l

3 of 5

Created with Fabric.js 3.6.6 Strings art hell hey how root a h r t e l y l

4 of 5

Created with Fabric.js 3.6.6 Strings art hell hey how root a h r t e l y o w l

5 of 5

Examples#

The following examples illustrate some problems that can be solved with this approach:

Created with Fabric.js 3.6.6 Longest common prefix in the given strings is "ca." Strings cat call car root c a t l r l Longest common prefix

1 of 2

Created with Fabric.js 3.6.6 root c a t l r l Search suggestions String cal Returns "call"since it matches the prefix "cal."

2 of 2

Does my problem match this pattern?#

  • Yes, if either of these conditions is fulfilled:

    • We need to compare two strings to detect partial matches, based on the initial characters of one or both strings.
    • We wish to optimize the space used to store a dictionary of words. Storing shared prefixes once allows for significant savings.
  • No, if either of these conditions is fulfilled:

    • The problem statement restricts us from breaking down the strings into individual characters.
    • Partial matches between pairs of strings are not significant to solving the problem.

Real-world problems#

Many problems in the real world use the trie pattern. Let’s look at some examples.

  • Autocomplete system: One of the most common applications of trie is the autocomplete system in search engines, such as Google. This is the feature that prompts the search engine to give us some suggestions to complete our query when we start typing something in the search bar. These suggestions are given based on common queries that users have searched already that match the prefix we have typed.
  • Orthographic corrector: Ever seen pop-up suggestions or red lines under a word while you’re typing a message? That’s an orthographic corrector making suggestions and pointing out spelling mistakes by searching through a dictionary. It uses a trie data structure for efficient searches and retrievals from the available database.

Strategy time!#

Match the problems that can be solved using the trie pattern.

Note: Select a problem in the left-hand column by clicking it, and then click one of the two options in the right-hand column.

Match The Answer
Select an option from the left-hand side

Design an autocomplete system for a search engine.

Explanation

Trie

Find all the words in a dictionary that start with “aba”.

Explanation

Some other pattern

Print all the permutations of a string.

Explanation

Fetch the top 55 elements repeatedly from a stream of numbers.

Explanation

Solution: Bus Routes

Search Suggestions System