Solution: Longest Substring without Repeating Characters

Let's solve the Longest Substring without Repeating Characters problem using the Sliding Window pattern.

Statement#

Given a string, input_str, return the length of the longest substring without repeating characters.

Constraints:

  • 11 \leq input_str.length 1000\leq 1000
  • input_str consists of English letters, digits, symbols, and spaces.

Solution#

So far, you’ve probably brainstormed some approaches on how to solve this problem. Let’s explore some of these approaches and figure out which one to follow while considering time complexity and any implementation constraints.

Naive approach#

The naive approach is to explore all possible substrings. For each substring, we check whether any character in it is repeated. After checking all the possible substrings, the substring with the longest length that satisfies the specified condition is returned.

The total time complexity of this solution is O(n3)O(n^3). To explore all possible substrings, the time complexity is O(n2)O(n^2), and to check whether all the characters in a substring are unique or not, the time complexity can be approximated to O(n)O(n). So, the total time complexity is O(n2)O(n)=O(n3)O(n^2) * O(n) = O(n^3). The space complexity of this naive approach is O(min(m,n))O(min(m, n)), where mm is the size of the character set and nn is the size of the string.

Optimized approach using sliding window#

We use a modified version of the classic sliding window method. Instead of a fixed-size window, we allow our window to grow, looking for the window that corresponds to the longest substring without repeating characters.

We initialize an empty hash map along with a variable to track character indices and the starting point of the window. Next, we traverse the string character by character. During each iteration, the current character is checked to see if it exists in the hash map. If it does not exist, it is added to the hash map along with its index. However, if the current character already exists in the hash map and its index falls within the current window, it indicates that a repeating character has been discovered. In this case, the start of the window is updated to the previous location of the current element and incremented, and the length of the current window is calculated. The longest substring seen so far is updated if the length of the current window is greater than its current value. Finally, the length of the longest substring is returned as the result.

Here’s how we implement this technique.

  1. We initialize the following set of variables to 0 to keep track of the visited elements.

    • window_start: The starting index of the current substring.

    • window_length: The length of the current substring.

    • longest: The length of the longest substring.

  2. For every element in the string, we check whether or not it’s present in the hash map.

    • If it isn’t present, we store it in the hash map such that the key is the current element and the value is its index in the string.
    • If it’s already present in the hash map, then the element may have already appeared in the current substring. For this, we check if the previous occurrence of the element is before or after the starting index, window_start, of the current substring.

      • If it’s after window_start, we calculate the current substring’s length, window_length, as the difference between the current index and window_start. If longest is less than the new window_length, we set longest as window_length.

      • To prevent the repetition of the current element in the current window, the next candidate substring will be set to start from just after the last occurrence of the current element.

    • We then update the value of the corresponding key in the hash map, setting it to the index of the current element.
  3. After traversing the entire sequence of elements, longest holds the length of the longest substring without repeating characters.

The following illustration shows these steps in detail:

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index Current length = 0 Longest = 0 We’re given an input array of characters. We traverse over the array and use a hash map to maintain the latest indexes of each character. Current length and longest keep track of the substring lengths. In all the coming slides, purple is the current window color, blue is the index being considered, and yellow is for theindexes that are no longer in the current window.

1 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 Current length = 0 Longest = 0 Since the current character is not present in the hash map, we add it to the hash map and store its index.

2 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 1 Current length = 0 Longest = 0 Since the current character is not present in the hash map, we add it to the hash map and store its index.

3 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 1 c 2 Current length = 0 Longest = 0 Since the current character is not present in the hash map, we add it to the hash map and store its index.

4 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 1 c 2 d 3 Current length = 0 Longest = 0 Since the current character is not present in the hash map, we add it to the hash map and store its index.

5 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 1 c 2 d 3 Current length = 0 Longest = 0 The current character is already present in the hash map, which means it's repeated. We update its corresponding index value to the latest index.

6 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 4 c 2 d 3 Current length = 4 Longest = 4 We update the current length with the length of the substring, and longest with max (longest, current length). The next substring starts after the last occurence of the current element. In this case, it now starts from index 2.

7 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 4 c 2 d 3 e 5 Current length = 4 Longest = 4 Since the current character is not present in the hash map, we add it to the hash map and store its index.

8 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 4 c 2 d 3 e 5 Current length = 4 Longest = 4 The current character is already present in the hash map, which means it's repeated. We'll update its corresponding index value to the latest index.

9 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 6 b 4 c 2 d 3 e 5 Current length = 4 Longest = 4 Since the current character last occured before the start of the current substring, it's not repeated, and therefore, we add it in our substring.

10 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 4 c 2 d 3 e 5 Current length = 5 Longest = 5 Update current length with the length of the substring and longest with max(longest, current length).

11 of 12

Created with Fabric.js 3.6.6 Input a b c d b e a Value Index a 0 b 4 c 2 d 3 e 5 Current length = 5 Longest = 5 Since we've reached the end of the array, we return the longest substring length. Substring = c d b e a, length = 5

12 of 12

Let’s look at the code for the solution:

Longest Substring without Repeating Characters

Solution summary#

To recap, the solution to this problem can be divided into the following parts:

  1. Traverse the input string.

  2. Use a hash map to store elements along with their respective indexes.

    • If the current element is present in the hash map, check whether it’s already present in the current window. If it is, we have found the end of the current window and the start of the next. We check if it’s longer than the longest window seen so far and update it accordingly.

    • Store the current element in the hash map with the key as the element and the value as the current index.

  3. At the end of the traversal, we have the length of the longest substring with all distinct characters.

Time complexity#

The time complexity of this solution is O(n)O(n), where nn is the length of the input string. This is because we have to iterate over all the elements in the string.

Space complexity#

The space complexity of this solution is O(1)O(1), which is the space occupied by the hash map. This is because there’s only a limited number of unique characters that could appear in the input string.

Longest Substring without Repeating Characters

Union Find: Introduction