Solution: Last Day Where You Can Still Cross

Let's solve the Last Day Where You Can Still Cross problem using the Union Find pattern.

Statement#

You are given two integers, rows and cols, which represent the number of rows and columns in a 1-based binary matrix. In this matrix, each 00 represents land, and each 11 represents water.

Initially, on day 0, the whole matrix will just be all 0s, that is, all land. With each passing day, one of the cells of this matrix will get flooded and, therefore, will change to water, that is, from 00 to 11. This continues until the entire matrix is flooded. You are given a 1-based array, water_cells, that records which cell will be flooded on each day. Each element water_cells[i]=[ri,ci]water\_cells[i] = [r_{i},c_{i}] indicates the cell present at the rith{r_{i}}^{th} row and cith{c_{i}}^{th} column of the matrix that will change from land to water on the ithi^{th} day.

We can cross any cell of the matrix as long as it’s land. Once it changes to water, we can’t cross it. To cross any cell, we can only move in one of the four cardinal directions. Given the number of rows and columns of a 1-based binary matrix and a 1-based array, water_cells, you are required to find the last day where you can still cross the matrix, from top to bottom, by walking over the land cells only.

Note: You can start from any cell in the top row, and you need to be able to reach just one cell in the bottom row for it to count as a crossing.

Constraints:

  • 22 \leq rows,, cols 2×104\leq 2 \times 10^4
  • 44 \leq rows ×\times cols 2×104\leq 2 \times 10^4
  • water_cells.length ==== rows ×\times cols
  • 1ri1 \leq r_{i} \leq rows
  • 1ci1 \leq c_{i} \leq cols
  • All values of water_cells are unique.

Solution#

Our aim is to find the maximum number of days where we can cross a 1-based binary matrix, let’s say, matrix, while it is being flooded one cell at a time. We need to cross matrix from top to bottom using a continuous path consisting only of land cells. If, at any time, we encounter a series of connected water cells from the leftmost side of the matrix to its rightmost side that blocks our continuous path of land cells, we won’t be able to cross the matrix.

In our solution, we’ll look for a single component of connected water cells from the left to the right of matrix, on each day. Once encountered, we’ll return this day as the last day where we can still cross matrix.

Note: We are not subtracting 1 from our final answer, because, in the context of this problem, we are starting from day 0, so we don’t need to adjust the value of days.

Since we are required to find a series of connected water cells, we can consider the cells in the matrix to be vertexes in a graph—each vertex connected to its neighbors—and then try to identify connected components consisting only of water cells using union find. The union find pattern is designed to group elements into sets based on a specified property. The pattern uses a disjoint set data structure, such as an array, to keep track of which set each element belongs to. In our case, there are two properties that define whether any two cells should be grouped together:

  1. They both need to be water cells.
  2. There must be a path between them consisting only of water cells, such that the movement from one water cell to its neighboring water cell is along one of eight directions: up, down, left, right, up-left, up-right, down-left, and down-right.

Note: As per the problem statement, there is a restriction on the connectivity of land cells. A valid path of land cells will only contain land cells connected to one of its four neighboring land cells, not eight. However, when checking whether two water cells are connected, we need to take into account eight directions, as explained below.

The most obvious way to block all paths from top to bottom is for all the cells in any row to be flooded. For example, in the matrix on the right, flooding the second row does the job.

However, this isn’t the only way to stop all the paths from the top row to the bottom.

Focusing on each column individually, we might think that flooding at least one cell in each column would be enough, yet as we can see in the matrix on the right, this doesn’t quite work. There are two paths to the bottom row from cell 1010: via cell 1212 or via cell 99.

We have blocked the path to cell 1616 by flooding cell 1212, and we can similarly flood cell 99 to block the path to cell 1313. Here, the water cells 1212 and 1515 are neighbors, not along a cardinal direction, but along an ordinal or intercardinal direction. The same applies to cells 99 and 1414.

This illustrates why, when checking whether two water cells are connected, we need to check in eight directions: up, down, left, right, up-left, up-right, down-left, and down-right.

Now that we have understood why the connectivity of water cells needs to be checked along eight directions, we can use the disjoint set data structure to track which water cells are connected, i.e., which water cells belong to the same connected component.

To use the union find pattern, we create and use the UnionFind class that has two primary methods:

  • union(node1, node2): This merges the sets of node1 and node2 into one by changing the representative of node1 to the representative of node2.

  • find(node): This finds and returns the representative of the set that contains the given node. It uses path compression, which speeds up the subsequent find operations by changing the parent of the visited node to the representative of the subset. This reduces the length of the path of that node to the representative, ensuring we don’t have to traverse all the intermediate nodes on future find operations.

Note: In terms of the tree representation of disjoint set union (DSU) data structure, a representative is an element in each set that resides at the root of the tree and represents that specific set.

Our solution will first initialize a variable, day, to 00, since we are starting from day 00. Next, we’ll create a rows ×\times cols grid, called matrix, with all cells initialized to 00s (since none of the cells have been flooded on day 00). Then, proceeding from day 11, we’ll start filling the matrix with water cells as per the given water_cells array. The value in each cell in matrix, identified by the current entry in water_cells, is changed from 00 to 11. For example, if we are on day 22 and the value of water_cells[2] is [1,2][1, 2], we will set the value of matrix[1][2] to 11, indicating that it is now flooded with water.

Since we are solving this problem using the union find pattern, we initialize a UnionFind object, which results in the creation of a custom disjoint set data structure: an array, reps, of size (rows * cols) + 2. The array reps is essentially a 1D array representation of matrix. The two additional elements in reps represent two extra virtual nodes—one at the leftmost side of the matrix and the other at the rightmost side of the matrix. To operate this mapping, we use a helper function, find_index(), which converts the coordinates of a cell (i,j)(i, j) in matrix to its corresponding unique index in reps. Initially, every element is its own representative, so the value at every index in reps is the index itself. During the execution of the union operation, when two elements are united, their representatives are set to the same value.

We use the virtual nodes to efficiently check whether the first and the last columns of matrix are connected. In simpler words, these nodes help identify whether we have a series of connected water cells from the left to the right side of matrix. If a water cell is present in the first column of matrix, we use the union operation to set the left virtual node as its representative. Similarly, if a water cell is in the last column, we use the union operation to set the right virtual node as its representative.

Every day, we flood another cell in the matrix, changing its value to 11. At this point, we check whether any of its eight neighboring cells is also a water cell. If it is, we use the union operation to connect it with the recently flooded cell. The union operation will connect both the cells by making their representatives the same in the array, reps. As mentioned above, if the recently flooded cell is at the left or right boundary of matrix, it will connect with the corresponding virtual node.

Next, we check to see if the flooding of this cell has created a continuous component connecting the leftmost side of matrix with its rightmost side. For this purpose, we use the find operation to check whether the left and right virtual nodes are connected, that is, whether or not they have the same representative.

If the two sides are connected, we can no longer cross the matrix from top to bottom, and the current value of day is our answer. Otherwise, we increment the value of day and repeat the procedure with the next cell in water_cells.

The slides below help to understand the solution in a better way.

canvasAnimation-image

1 of 19

canvasAnimation-image

2 of 19

canvasAnimation-image

3 of 19

canvasAnimation-image

4 of 19

canvasAnimation-image

5 of 19

canvasAnimation-image

6 of 19

canvasAnimation-image

7 of 19

canvasAnimation-image

8 of 19

canvasAnimation-image

9 of 19

canvasAnimation-image

10 of 19

canvasAnimation-image

11 of 19

canvasAnimation-image

12 of 19

canvasAnimation-image

13 of 19

canvasAnimation-image

14 of 19

canvasAnimation-image

15 of 19

canvasAnimation-image

16 of 19

canvasAnimation-image

17 of 19

canvasAnimation-image

18 of 19

canvasAnimation-image

19 of 19

Let’s take a look at the code for this solution below:

main.py
union_find.py
Last Day Where You Can Still Cross

Time complexity#

The time complexity of this solution is O((mn)α(mn))O((m \cdot n) \cdot \alpha(m \cdot n)), where mm represents the number of rows in the matrix, nn represents the number of columns in the matrix, and α\alpha is the inverse Ackermann function that grows very slowly and whose maximum value for all practical purposes is 44.

The initialization of the UnionFind object takes O(mn)O(m \cdot n) time, since it involves creating an array of size (mn)+2(m \cdot n) + 2 and initializing every element to a unique index.

There may be, in the worst case, O(mn)O(m⋅n) entries in water_cells. For each entry, there can be a maximum of eight union operations to merge the flooded cell with adjacent water cells, and each union operation takes O(α(mn))O(α(m⋅n)) time on average.

Therefore, the overall time complexity of this solution is O((mn)α(mn))O((m \cdot n) \cdot \alpha(m \cdot n)).

Space complexity#

The space complexity of the solution above is O(mn)O(m \cdot n).

Alternative solutions#

Now, let’s see what are some other ways to solve this problem.

  • Union find on land cells:

    This is similar to the solution presented above. Here, we start at the ending state of matrix with all the cells flooded. Then, moving from the end of the water_cells array towards its start, we start rolling back the flooding (one cell at a time), that is, changing that cell’s value from 11 to 00. After changing a cell from water to land, we check the land cells adjacent to it in all four cardinal directions, and if any are found, we connect them to the recently reverted cell. To check whether we got a single connected component of land cells from top to bottom of matrix, we keep two virtual nodes—one at the top and the other at the bottom. The moment we find these virtual nodes connected to each other, we’ll return this day as the output.

    The time and space complexity of this solution is the same as the solution above.

  • Breadth-first search (BFS) with binary search:

    In this solution, for each new flooded cell on the ithi^{th} day, we use BFS to figure out whether we have a path of land cells from top to bottom to cross matrix. There is a total of (mn)(m \cdot n) cells that needs to be flooded as per water_cells, and finding a path from the top to the bottom row for each of them would be highly inefficient. Therefore, we use binary search to reduce this number. Our binary search method is guided by these two observations:

    • If we can cross on day ii, we can cross on any day before that day.
    • If we can’t cross on day ii, we cannot cross on any day after that day.

    For example, if there are 1616 cells in water_cells, then, instead of searching for paths on each of the 1616 days, we start from the middle cell, the 8th8^{th} cell. We flood all the cells mentioned in water_cells up to the 8th8^{th} day and then check if a path to cross matrix exists. If it does, then we flood all the cells mentioned up to the 12th12^{th} day, and check again. If no path exists, we roll back the flooding to the 10th10^{th} day and check again, and so on.

    To find a continuous path from top to bottom to cross matrix, we start the BFS from the top row of matrix and insert all its land cells into a queue. Then, we start to dequeue the elements in the queue one by one until it’s empty. Whenever we dequeue a cell, we check all of its adjacent land cells, and if we have not visited them yet, we add them to the queue. After checking all of its adjacent land cells, we mark the cell as visited and process the next element in the queue. We repeat this until we reach a cell that is in the bottom row of matrix. At this point, we have found a path from the top row to the bottom row. Another possible terminating condition for the BFS is that we are unable to find a path from any of the cells in the top row that reaches a cell in the bottom row.

    Overall, the binary search takes log(mn)log(m \cdot n), and the BFS takes O(mn)O(m \cdot n) time, so the time complexity of this solution is O((mn)log(mn))O((m \cdot n) \cdot log(m \cdot n)). The space complexity of this solution is O(mn)O(m \cdot n).

  • Depth-first search (DFS) with binary search:

    This solution is similar to the one we just discussed, except it uses DFS instead of BFS. On each day, for each cell in the top row of matrix, we explore its unvisited neighboring land cells recursively. We continue doing this until we reach the bottom row, or until all reachable land cells in matrix have been visited. To avoid revisiting the same cell, we just mark the cell as visited.

    The time and space complexity of this solution is the same as that of the BFS with binary search.

Last Day Where You Can Still Cross

Minimize Malware Spread