Solution: Accounts Merge

Let's solve the Accounts Merge problem using the Union Find pattern.

Statement#

You are given a 2D array, accounts, where each row, accounts[i], is an array of strings, such that the first element, accounts[i][0], is a name, while the remaining elements are emails associated with that account. Your task is to determine if two accounts belong to the same person by checking if both accounts have the same name and at least one common email address.

If two accounts have the same name, they might belong to different people since people can have the same name. However, all accounts that belong to one person will have the same name. This implies that a single person can hold multiple accounts.

The output should be a 2D array in which the first element of each row is the name, and the rest of the elements are the merged list of that user’s email addresses in sorted order. There should be one row for each distinct user, and for each user, each email address should be listed only once.

Note: Please use a sort function that sorts the email addresses based on the ASCII value of each character.

Constraints:

  • 1<=1 <= accounts.length <=1000<= 1000

  • 2<=2 <= accounts[i].length <=10<= 10

  • 1<=1 <= accounts[i][j].length <=30<= 30

  • Because accounts[i][0] is the name of any person, it should contain only English letters.

  • For j>0j>0, accounts[i][j] should be a valid email.

Solution#

Since the problem involves merging the accounts that have common emails among them, we'll use the union find pattern to solve it. The union find pattern is used to group elements into sets based on a specified property. Each set forms a tree data structure and has a representative element that resides at the root of the tree. The pattern uses a disjoint set data structure, such as an array, to keep track of the membership of each element in the set.

To use the union find pattern, we create and use the UnionFind class, which has two primary methods:

  • find(node): Returns the set’s main representative, the root parent, of a given node.

  • union(node1, node2): Merges the sets of node1 and node2 into one.

First, we assign a unique ID (an integer value) to each account, making them the representative of their sets. For example, if we have the accounts = [ [Emma, e@mail.com], [Bob, b@mail.com], [Emma, e2@mail.com, e@mail.com] ][ ~[Emma, ~e@mail.com], ~[Bob, ~b@mail.com], ~[Emma, ~e2@mail.com, ~e@mail.com] ~], we assign them the IDs 00, 11, and 22. We create an array to store these IDs, that is, parents = [0, 1, 2][0,~1,~2].

Next, we iterate over the given accounts, and for each account, we iterate over its associated emails and map them to one of the parents. If an email is seen for the first time, it gets the ID of its current parent. However, if the email is being seen for the second time, that is, if two accounts share the same email, we take the union of this email’s previously assigned ID and its current ID to make a connection between them. For example, for accounts = [ [Emma, e@mail.com], [Bob, b@mail.com], [Emma, e2@mail.com, e@mail.com] ][ ~[Emma, ~e@mail.com], ~[Bob, ~b@mail.com], ~[Emma, ~e2@mail.com, ~e@mail.com] ~], we assign e@mail.come@mail.com the ID of its current parent, EmmaEmma (that is, 00) when we encounter it for the first time. When we see it for the second time, we take the union, such as union(0,2), where 00 is its previously assigned ID, and 22 is its current ID. Here, we connect the two accounts because they share the same email. This will eventually change the ID of the second EmmaEmma account to the first EmmaEmma account such that now, parents = [0, 1, 0][0,~1,~0], representing that both accounts belong to the same person.

After the mapping above, we merge all the accounts sharing emails into single accounts. To achieve this, we use find(node) for each email to check who is the root parent of the given node. Finally, we go over each account and sort the emails within that account. According to our example, the final merged accounts will look like this: [ [Emma, e@mail.com, e2@mail.com], [Bob, b@mail.com] ][ ~[Emma, ~e@mail.com, ~e2@mail.com], ~[Bob, ~b@mail.com] ~].

Let’s look at the following illustration to get a better understanding of the solution:

Created with Fabric.js 3.6.6 Emma e2@mail.com e1@mail.com Bob b7@mail.com b0@mail.com Emma e1@mail.com e8@mail.com Bob b3@mail.com Accounts The task is to merge the accounts that share any email(s).

1 of 15

Created with Fabric.js 3.6.6 Emma : 0 Bob : 1 Emma : 2 Bob : 3 We assign a unique ID to each account to make them the representatives (root parents) of their sets.Now, we'll iterate over each account's emails and put them in their relevant sets.

2 of 15

Created with Fabric.js 3.6.6 Bob : 1 Emma : 2 Bob : 3 Emma : 0 e2@mail.com First account's emails = [ e2@mail.com, e1@mail.com ]e2@mail.com belongs to the first account, so add it to that set (having the ID 0).

3 of 15

Created with Fabric.js 3.6.6 Emma : 2 Bob : 3 Emma : 0 e2@mail.com e1@mail.com Bob : 1 First account's emails = [ e2@mail.com, e1@mail.com ]e1@mail.com also belongs to the first account, so add it to that set (having the ID 0).

4 of 15

Created with Fabric.js 3.6.6 Emma : 2 Bob : 3 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Second account's emails = [ b7@mail.com, b0@mail.com ]b7@mail.com and b0@mail.com belong to the second account, so add them to that set (having the ID 1).

5 of 15

Created with Fabric.js 3.6.6 Bob : 3 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com union(0, 2) Emma : 2 Third account's emails = [ e1@mail.com, e8@mail.com ]e1@mail.com is already in the first account's set. This implies that this account and the first account belong to the same person because they share the same email. Therefore, we connect these two accounts by making 0 (the ID of first account) the root parent of this account.

6 of 15

Created with Fabric.js 3.6.6 Bob : 3 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Emma : 0 Note that the main representative of both the accounts (the first and the third) has been set to 0.

7 of 15

Created with Fabric.js 3.6.6 Bob : 3 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Emma : 0 e8@mail.com Third account's emails = [ e1@mail.com, e8@mail.com ]e8@mail.com belongs to the third account's set, and it is not already present in the first or any other account's set, so add it to the current account's set.

8 of 15

Created with Fabric.js 3.6.6 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Bob : 3 b3@mail.com Emma : 0 e8@mail.com Fourth account's emails = [ b3@mail.com ]b3@mail.com belongs to the fourth account's set, so add it to that set.

9 of 15

Created with Fabric.js 3.6.6 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Bob : 3 b3@mail.com Emma : 0 e8@mail.com Now, we merge the accounts that have the same parent ID, i.e., the same root parent. We look for the root parent of each set using the find method, and merge the accounts accordingly.

10 of 15

Created with Fabric.js 3.6.6 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Bob : 3 b3@mail.com Emma : 0 e8@mail.com We find the root parents of e2@mail.com and e1@mail.com. It is 0 for boththe email addresses because their root parent is 0, so they remain in the same set.

11 of 15

Created with Fabric.js 3.6.6 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Bob : 3 b3@mail.com Emma : 0 e8@mail.com Similarly, we find the root parents of b7@mail.com and b0@mail.com. It is 1 for both, so they remain in the same set.

12 of 15

Created with Fabric.js 3.6.6 Emma : 0 e2@mail.com e1@mail.com Bob : 1 b7@mail.com b0@mail.com Bob : 3 b3@mail.com Emma : 0 e8@mail.com Now, we find the root parent of e8@mail.com. It is 0 since its main representative is 0, so we merge it with the first account's set.

13 of 15

Created with Fabric.js 3.6.6 Bob : 1 b7@mail.com b0@mail.com Bob : 3 b3@mail.com Emma : 0 e2@mail.com e1@mail.com e8@mail.com Note that the first and the third accounts have been merged.Next, we find the root parent of b3@mail.com, which is 3, so it remains in the same set.

14 of 15

Created with Fabric.js 3.6.6 Bob : 3 b3@mail.com Emma : 0 e1@mail.com e2@mail.com e8@mail.com Bob : 1 b0@mail.com b7@mail.com The final step is to sort the emails in each account and then return these merged accounts with the sorted emails as the output.

15 of 15

Let’s have a look at the code for the algorithm we just discussed.

main.py
union_find.py
Accounts Merge

Time complexity#

Finding the root parents of all the emails to merge them will take O(nk.α(n))O(nk.α(n)), where αα is the inverse Ackermann function that grows very slowly, nn is the total number of accounts, and kk is the maximum number of elements in any account. Sorting all the emails in each account takes O(nk.log(nk))O(nk.log(nk)). Therefore, the overall time complexity for this solution is O(nk.α(n))+O(nk.log(nk))O(nk.α(n)) + O(nk.log(nk)).

Space complexity#

The space complexity for this solution is O(nk)O(nk).

Accounts Merge

Final Remarks