Using trie solving a real world algorithm problem
📅2023-07-12🧐92
The problem
I saw a message in a tech group for 30 minutes ago:
Here is the inputs and the expected output:
full name list:
[
{
"name": "John Doe"
},
{
"name": "Jane Frances Dough"
}
]
janky list
[
{
"name": "doe@email.com",
"ID": 1
},
{
"name": "J Dough"
"ID": 2
},
{
"name": "Bob Barker"
"ID": 3
},
{
"name": "Barker"
"ID": 4
}
]
expected output:
[
{
"name": "doe@email.com",
"ID": 1
},
{
"name": "J Dough"
"ID": 2
}
]
Solution
Trie is a good apporach in this case, we can use it to reduce the time complexity, and to use a least space as small as possible.
Here is how a Trie works:
https://en.wikipedia.org/wiki/Trie
The time complexity in this case is O(n) which n is the length of the janky list. And all the ideas are in the comment.
def search(arr1, arr2):
initials = {} # a trie to save all possible combinations of all names
"""
like this:
"j": {
"jane": { # her full first name
"dough": {} # her full last name
}
"dough": {} # her full last name
}
two type of search can match the dict above:
a["j"]["jane"]["dough"] # True
a["j"]["dough"] # True
if you like, also can save middle name as well
"""
ln_emails = set() # last name as email username
# gathered
for item in arr2:
names = item["name"].split()
f_init = names[0][0].lower()
f_full = names[0].lower()
l_full = names[-1].lower()
# add to initials trie
initials[f_init] = {
f_full: {
l_full:{}
},
l_full: {}
}
# add to email set
ln_emails.add(l_full)
result = []
# janky
for item in arr1:
# Case 1: email , query time O(1)
email_name_right = item["name"].find("@")
if email_name_right >= 0:
if item["name"][:email_name_right] in ln_emails:
result.append(item)
continue
# Case 2: not an email, query time O(1)
names = item["name"].split()
f_init = names[0][0].lower()
f_full = names[0].lower()
l_full = names[-1].lower()
if f_init in initials:
found = False
if f_full in initials[f_init]:
if l_full in initials[f_init][f_full] or not len(initials[f_init][f_full]):
result.append(item)
found = True
if not found and l_full in initials[f_init]:
result.append(item)
return result
Test
janky = [
{
"name": "doe@email.com",
"ID": 1
},
{
"name": "J Dough",
"ID": 2
},
{
"name": "Bob Barker",
"ID": 3
},
{
"name": "Barker",
"ID": 4
}
]
gathered1 = [
{
"name": "John Doe"
},
{
"name": "Jane Frances Dough"
}
]
gathered2 = [
{
"name": "Bob Barker"
},
{
"name": "Barker"
}
]
print("test1", search(janky, gathered1))
print("test2", search(janky, gathered2))
"""
output:
Finished in 58 ms
test1 [{'name': 'doe@email.com', 'ID': 1}, {'name': 'J Dough', 'ID': 2}]
test2 [{'name': 'Bob Barker', 'ID': 3}, {'name': 'Barker', 'ID': 4}]
"""