Solution: Maximum Profit in Job Scheduling

Let's solve the Maximum Profit in Job Scheduling problem using the Dynamic Programming pattern.

We'll cover the following

Statement
Solution
- Naive approach
- Optimized approach using dynamic programming

Statement#

Suppose there are $n$ jobs, each with a start time, end time, and a corresponding profit. The goal is to schedule these jobs in a way that the sum of their profits is maximized while ensuring that no two jobs overlap in time. Given the arrays of start_time, end_time, and profit, calculate the maximum profit that can be obtained.

Note: Selecting a job that finishes at time $t$ means you can pick the next job that starts at the same time $t$ or later.

Constraints:

$1 \leq$ start_time.length $==$ end_time.length $==$ profit.length $\leq$ $10^3$
$1 \leq$ start_time[i] $<$ end_time[i] $\leq 10^5$
$1 \leq$ profit[i] $\leq 10^4$

Solution#

So far, you have probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach#

In the naive approach, we systematically explore all possible combinations of scheduling jobs to find the one that provides us the maximum profit. For each of the $n$ jobs, there are two choices:

Schedule the job.
Do not schedule the job.

This results in $2^n$ potential combinations. We evaluate each combination, calculating the total profit achieved while ensuring that no two jobs overlap in their time frames, as that would violate the constraint. After checking all combinations, we select the one that maximizes profit and adheres to the non-overlapping constraint, representing the optimal job schedule.

Since this approach requires an exhaustive search process for $2^n$ potential combinations, it leads to a time complexity of $O(2^n)$ . Then, to calculate the total profit, the algorithm checks for overlaps by comparing each pair of jobs in the combination. This overlap-checking process takes $O(n^2)$ time for each combination. As a result, the overall time complexity of this approach becomes $O(2^n \times n^2)$ . However, the space complexity of this solution is $O(n)$ because the maximum length of a combination during the process can be $n$ .

Optimized approach using dynamic programming#

When we read the problem statement, there are a couple of points we need to consider:

The problem requires us to find the schedule that provides maximum profit, which, in other words, is the optimal solution.
A job cannot be scheduled if an overlapping job has already been scheduled. This implies that the task of scheduling any job under consideration is dependent on the previously scheduled jobs, which represents an optimal substructure property.

The above points signify that this problem can be solved using Dynamic Programming since it exhibits characteristics typical of such problems. We can find the optimal solution for scheduling jobs to maximize the associated profit by breaking the problem into subproblems. In each subproblem, we'll consider scheduling one job at a time, taking into account the already scheduled jobs and their conflicts. This allows us to gradually build up the optimal schedule based on the optimal solutions to smaller subproblems.

Another important aspect of solving this problem is knowing when two jobs would overlap. Let's say there are two jobs, $A$ and $B$ , then both will overlap if:

We use binary search to efficiently determine whether any of the previously scheduled jobs overlap with the current job under consideration. We do this by comparing the end time of the previously scheduled job with the current job's start time.

In this solution, we use binary search within dynamic programming to find the final answer. The algorithm works as follows:

Since the input data for $n$ jobs is provided in three separate arrays, it is difficult to correctly sort and manage it, specifically when dealing with an individual job. Therefore, we combine all the information related to a particular job in a tuple, such as (start_time[i], end_time[i], profit[i]), and then store these tuples in a new array, jobs. Next, we sort the jobs array with respect to each job's end time in ascending order. This is to minimize the chance of selecting jobs with overlapping time ranges because now jobs with earlier end times will be considered before jobs with later end times.

Now, we create an array, dp, of size $n$ to store the results of each subproblem. Each element, dp[i], will store the maximum profit that can be obtained by considering only the first i jobs. To fill up dp, we iterate through the sorted jobs in the jobs array. For each job, jobs[i], we consider two options:

Include the current job: Here, we count the current job's profit, jobs[i][2], towards the calculation of the maximum profit, included_profit. To do this, we add the current job's profit to the profit obtained from jobs that don't overlap with the current job. Now, how do we find the value of the latter? We can't simply use the value of dp[i-1] because first, we need to make sure that the previous job doesn't overlap with the current one. This is where binary search plays a vital role. We use binary search to find the index, prev_valid_job, of the latest possible job that doesn't overlap with the current job's start time. If there is no such job, binary search returns $-1$ . Once we have the required index, we can use the profit against it, dp[prev_valid_job], to calculate the value of the maximum profit, included_profit $=$ jobs[i][2] $+$ dp[prev_valid_job].
Exclude the current job: Here, we don't count the current job's profit towards the calculation of the maximum profit, excluded_profit. Instead, we just consider the profit value in the previous cell, dp[i-1], which represents the maximum profit that can be obtained up to the previous job, jobs[i-1].

Note: For the first job, when i $=$ 0, we set included_profit to its profit value and excluded_profit to 0 since there is no other job in the comparison.

Once we have calculated the values of included_profit and excluded_profit, we set dp[i] to the maximum value among the two. This is to choose the option that yields the higher profit: including the current job or excluding it. After iterating through all jobs, the last element, dp[n-1], will contain the final answer.

Let’s look at the following illustration to get a better understanding of the solution:

def job_scheduling(start_time, end_time, profit):
    jobs = zip(start_time, end_time, profit)
    jobs = sorted(jobs, key=lambda x: x[1])
    n = len(jobs)
    dp = [0] * n
    for i in range(n):
        included_profit = jobs[i][2]
        prev_valid_job = binary_search(jobs, i)
        if prev_valid_job != -1:
            included_profit += dp[prev_valid_job]
        if i > 0:
            excluded_profit = dp[i - 1] 
        else:
            excluded_profit = 0
        dp[i] = max(included_profit, excluded_profit)
    return dp[n - 1]
def binary_search(jobs, current_job):
    left = 0 
    right = current_job - 1
    prev_valid_index = -1
    while left <= right:
        mid = (left + right) // 2
        if jobs[mid][1] <= jobs[current_job][0]:
            prev_valid_index = mid
            left = mid + 1

Enter to Rename, Shift+Enter to Preview

Maximum Profit in Job Scheduling

Solution summary#

Combine the input data into a single array of tuples, where each tuple will have the start time, end time, and the associated profit of a particular job.
Sort this newly created array based on the end times of jobs in ascending order.
Initialize an array, dp, to store the maximum profit at each job index.
Iterate through the sorted list of jobs, one job at a time (from the first job to the last job). For each job, calculate the maximum profit achievable by considering two options:
- Include the current job, which involves adding its profit to the maximum profit of the last non-overlapping job. Find the last non-overlapping job using the binary search.
- Exclude the current job, which takes the maximum profit obtained from the previous job.
Update the dp array with the value that is the maximum profit among the two options—including or excluding the current job. Continue doing this for all jobs.
After processing all jobs, the last element, dp[n - 1], will contain the maximum profit that can be obtained by scheduling the jobs while ensuring no two jobs overlap in their time ranges.

Time complexity#

Sorting the jobs based on their end times using Python's sorted function takes $O(n \log n)$ , where $n$ is the total number of jobs. The loop that iterates over all the $n$ jobs and performs a binary search on each takes $O(n\log n)$ as well. Therefore, the overall time complexity of this solution is $O(n \log n)$ .

Space complexity#

The space taken by the arrays jobs and dp is $O(n)$ . Python's sorted function also takes $O(n)$ space. Therefore, the overall space complexity of this solution is $O(n)$ .

Maximum Profit in Job Scheduling

Maximum Subarray