Solution: First Bad Version

Statement#

The latest version of a software product fails the quality check. Since each version is developed upon the previous one, all the versions created after a bad version are also considered bad.

Suppose you have n versions with the IDs [1,2,...,n][1, 2, ..., n], and you have access to an API function that returns TRUE if the argument is the ID of a bad version.

Find the first bad version that is causing all the later ones to be bad. Additionally, the solution should also return the number of API calls made during the process and should minimize the number of API calls too.

Constraints:

  • 11 \leq first bad version \leq n 2311\leq 2^{31} - 1

Solution#

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach#

The naive approach is to find the first bad version in the versions range by linear search. We traverse the whole version range one element at a time until we find the target version.

The time complexity of this approach is O(n)O(n), because we may have to search the entire range in this process. This approach ignores an important piece of information: the range of version numbers is sorted from 11 to nn. Let’s see if we can use this information to design a faster solution.

Since the versions range is sorted, binary search is the most efficient approach for finding the first bad version. It executes in O(logn)O(\log{n}) time, which is faster than the alternate linear scanning method.

Our goal is to find the first bad version by making the minimum number of calls to the API function. We use the binary search algorithm. The idea is to check the middle version to see if it’s bad. If it’s good, this means that the first bad version occurs later in the range, allowing us to entirely skip checking the first half of the range. If it’s bad, we need to check the first half of the range to find the first bad version. Therefore, we can skip checking the second half of the range.

When we go to check the identified half of the range, we use the same approach again. We check the middle version to figure out which half of the current range to check.

Note: In the following section, we will gradually build the solution. Alternatively, you can skip straight to just the code.

Step-by-step solution construction#

Let’s start with the simplest step. Since we know the number of versions and also the versions range is sorted, we use two pointers, first and last. first will point to the first version, which is 11, and last will point to the last version which is n. Additionally, we initialize a variable, calls, with zero that counts the number of API calls made by the solution to find the first bad version.

First Bad Version

Next, we can find the middle version’s ID. We do it in a while loop because we repeat this step until the first becomes equal to the last. We will pass the middle to the API function, which will tell us whether or not the middle version is bad.

First Bad Version

Call the API fuction, is_bad_version, to check if whether or not that version is bad. Increment the calls variable on every API call so that at the end, we can return the number of API calls. In this way, each API call helps us divide our data set in half and narrow down the search space. Now, we just have to keep iterating and narrowing down until the version is found. Here’s how we divide the search space:

  • If the middle version is bad, we know that this is either the first bad version or that the first bad version occurred prior to this version. Therefore, we search for the first bad version from 1 to mid-1.
  • If the middle version is not bad, we need to search for the first bad version from mid+1 to n.

After we’ve found the first bad version, we will return a tuple containing first bad version and the minimum number of API calls.

The slides below illustrate how we would like the algorithm to run:

Created with Fabric.js 3.6.6 Versions 1 2 3 4 5 6 7 8 calls 0 Given the above versions range, we will find the first bad version using minimum number of API calls.

1 of 5

Created with Fabric.js 3.6.6 Versions 1 2 3 4 5 6 7 8 first mid last calls 1 The mid version is not bad, so we will divide the search space and focus on the second half of the list. Increment calls.

2 of 5

Created with Fabric.js 3.6.6 1 2 3 4 5 6 7 8 first mid last Versions calls 2 The mid version is bad, so this could be the first bad version or the first bad version could be before the mid. Increment calls.

3 of 5

Created with Fabric.js 3.6.6 Versions 1 2 3 4 5 6 7 8 first mid last calls 3 The mid version is not bad, so we will shift the first pointer ahead.Increment calls.

4 of 5

Created with Fabric.js 3.6.6 Versions 1 2 3 4 5 6 7 8 first last calls 3 Since the first now equals the last, the search is over and we return the first. Return the first bad version along with calls (6, 3).

5 of 5

First Bad Version

Just the code#

Here’s the complete solution to this problem:

First Bad Version

Solution summary#

To recap, the solution to this problem can be divided into the following parts:

  1. Assign two pointers at the IDs of the first and last versions.

  2. Calculate the mid and check if that mid version is bad.

  3. If the mid version is bad, search for the first bad version in the first half of the range of versions.

  4. If the mid version is good, search in the second half of the range of versions.

  5. Repeat the steps to check the mid version and to divide the range of versions in two until the first and last converge.

Time complexity#

This algorithm takes O(logn)O(\log{n}) time, because we keep dividing the searching space in half at each iteration, where nn is the number of versions.

Space complexity#

The algorithm uses O(1)O(1) space.

First Bad Version

Search in Rotated Sorted Array