Educative: Interactive Courses for Software Developers

Statement#

The latest version of a software product fails the quality check. Since each version is developed upon the previous one, all the versions created after a bad version are also considered bad.

Suppose you have n versions with the IDs $[1, 2, ..., n]$ , and you have access to an API function that returns TRUE if the argument is the ID of a bad version.

Find the first bad version that is causing all the later ones to be bad. Additionally, the solution should also return the number of API calls made during the process and should minimize the number of API calls too.

Constraints:

$1 \leq$ first bad version $\leq$ n $\leq 2^{31} - 1$

Solution#

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach#

The naive approach is to find the first bad version in the versions range by linear search. We traverse the whole version range one element at a time until we find the target version.

The time complexity of this approach is $O(n)$ , because we may have to search the entire range in this process. This approach ignores an important piece of information: the range of version numbers is sorted from $1$ to $n$ . Let’s see if we can use this information to design a faster solution.

Optimized approach using modified binary search#

Since the versions range is sorted, binary search is the most efficient approach for finding the first bad version. It executes in $O(\log{n})$ time, which is faster than the alternate linear scanning method.

Our goal is to find the first bad version by making the minimum number of calls to the API function. We use the binary search algorithm. The idea is to check the middle version to see if it’s bad. If it’s good, this means that the first bad version occurs later in the range, allowing us to entirely skip checking the first half of the range. If it’s bad, we need to check the first half of the range to find the first bad version. Therefore, we can skip checking the second half of the range.

When we go to check the identified half of the range, we use the same approach again. We check the middle version to figure out which half of the current range to check.

def first_bad_version(n):
    # Assigning first pointer with the first version that is 1
    first = 1
    # Assigning last pointer with n that is the number of versions
    last = n
    calls = 0
    print("\n\tVersions: ", end="")
    for x in range(first, last):
        print(x, end=", ")
    print(last, "\n")
    print("\tfirst:", first)
    print("\tlast:", last)
    return first, calls
# Driver code
def main():
    global v
    test_case_versions = [38, 13, 29, 40, 23]
    first_bad_versions = [28, 10, 10, 28, 19]
    for i in range(len(test_case_versions)):
        v = first_bad_versions[i]
        print(i + 1,  ".\tNumber of versions: ", test_case_versions[i], sep="")
        first_bad_version(test_case_versions[i])
        print("-"*100)
if __name__ == '__main__':

Enter to Rename, Shift+Enter to Preview

First Bad Version

def is_bad_version(s):
    return s >= v
    
def first_bad_version(n):
    # Assigning first pointer with the first version that is 1
    first = 1
    # Assigning last pointer with n that is the number of versions
    last = n
    calls = 0
    print("\n\tVersions: ", end="")
    for x in range(first, last):
        print(x, end=", ")
    print(last, "\n")
    # Binary search to find the target version
    while first < 2:  # Last should be equal to n.But here last = 2. Because we are calculating mid only one time here just to show how it works
        mid = first + (last - first) // 2;
        first += 1
        print("\n\tmid: ", mid)
        # is_bad_version() is the API function that returns true
        # or false depending upon whether the provided version ID
        # is bad or good
        if (is_bad_version(mid)):
            print(f"\tVersion {mid} is bad, so we search in the first half of the list.\n")
        else:
            print(f"\tVersion {mid} is good, so we search in the second half of the list.\n")
    return first, calls
# Driver code

Enter to Rename, Shift+Enter to Preview

First Bad Version

Call the API fuction, is_bad_version, to check if whether or not that version is bad. Increment the calls variable on every API call so that at the end, we can return the number of API calls. In this way, each API call helps us divide our data set in half and narrow down the search space. Now, we just have to keep iterating and narrowing down until the version is found. Here’s how we divide the search space:

If the middle version is bad, we know that this is either the first bad version or that the first bad version occurred prior to this version. Therefore, we search for the first bad version from 1 to mid-1.
If the middle version is not bad, we need to search for the first bad version from mid+1 to n.

After we’ve found the first bad version, we will return a tuple containing first bad version and the minimum number of API calls.

The slides below illustrate how we would like the algorithm to run:

def is_bad_version(s):
    return s >= v
def first_bad_version(n):
    # Assigning first pointer with the first version that is 1
    first = 1
    # Assigning last pointer with n that is the number of versions
    last = n
    calls = 0
    print("\n\tVersions: ", end="")
    for x in range(first, last):
        print(x, end=", ")
    print(last, "\n")
    # Binary search to find the target version
    while first <= last:  # Now we will search the entire list
        mid = first + (last - first) // 2;
        # is_bad_version() is the API function that returns true
        # or false depending upon whether the provided version ID
        # is bad or good
        if is_bad_version(mid):  # Calling the API function to determine if the version is bad
            last = mid - 1
            print(f"\tVersion {mid} is bad, so we search in the first half of the list.\n")
        else:
            first = mid + 1
            print(f"\tVersion {mid} is good, so we search in the second half of the list.\n")
        calls += 1
        print("\t{:<4}{:<5}{:<4}{:<7}{:<4}{:<7}{:<4}{:<8}{:<5}".format( \
            "|", "first", "|", "last", "|", "mid", "|", "calls", "|"))

Enter to Rename, Shift+Enter to Preview

First Bad Version

def is_bad_version(s):
    return s >= v
def first_bad_version(n):
    first = 1
    last = n
    calls = 0
    while first <= last:  
        mid = first + (last - first) // 2;
        if is_bad_version(mid):  
            last = mid - 1
        else:
            first = mid + 1
        calls += 1
    return first, calls
# Driver code
def main():
    global v
    test_case_versions = [38, 13, 29, 40, 23]
    first_bad_versions = [28, 10, 10, 28, 19]
    for i in range(len(test_case_versions)):
        v = first_bad_versions[i]
        if i > 0:
            print()
        print(i + 1,  ".\tNumber of versions: ", test_case_versions[i], sep="")
        result = first_bad_version(test_case_versions[i])
        print("\n\tFirst bad version: ",
              result[0], ". Found in ", result[1], " API calls.", sep="")
        print("-"*100)

Enter to Rename, Shift+Enter to Preview

First Bad Version

Solution summary#

To recap, the solution to this problem can be divided into the following parts:

Assign two pointers at the IDs of the first and last versions.
Calculate the mid and check if that mid version is bad.
If the mid version is bad, search for the first bad version in the first half of the range of versions.
If the mid version is good, search in the second half of the range of versions.
Repeat the steps to check the mid version and to divide the range of versions in two until the first and last converge.

Time complexity#

This algorithm takes $O(\log{n})$ time, because we keep dividing the searching space in half at each iteration, where $n$ is the number of versions.

Space complexity#

The algorithm uses $O(1)$ space.

Solution: First Bad Version

Statement#

Solution#

Naive approach#

Optimized approach using modified binary search#

Step-by-step solution construction#

Just the code#

Solution summary#

Time complexity#

Space complexity#