Teaching Kids Programming: Videos on Data Structures and Algorithms
You are given a list of unique strings genes where each element has the same length and contains characters “A”, “C”, “G” and/or “T”.
If strings a and b are the same string except for one character, then a and b are in the same mutation group.
If strings a and b are in a group and b and c are in a group, then a and c are in the same group.
Return the total number of mutation groups.Constraints
n ≤ 10,000
k ≤ 20 where k is the length of a string in genes
Example 1
Input
genes = [“ACGT”, “ACCT”, “AGGT”, “TTTT”, “TTTG”]
Output
2
Explanation
There are two mutation groups:
[“ACGT”, “ACCT”, “AGGT”]
[“TTTT”, “TTTG”]Hint:
How can we connect one word with another? Is there a better way to connect than simply comparing each pair? After connecting words with each other, return the groups they form.
Human has 23 chromosomes (one of them controls the gender, where XX means girls and XY means boys). Each chromosome has hundreds of genes.
Gene Mutation Groups via Recursive Depth First Search Algorithm
We can treat each Gene Group as a vertex, and the edges are connected between two groups if they differ only 1 Gene. We can use a hash set to mark the nodes that we have seen. And then we can start Depth first search on each vertex that is not seen/visited.
The counter is incremented when we start a new Depth first search (DFS) on a vertex. The answer will be the number of components for the Graph.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | class Solution: def geneMutationGroup(self, genes): seen = set() dna = set(genes) def dfs(a): seen.add(a) for i in range(len(a)): for x in ('A', 'G', 'T', 'C'): c = a[:i] + x + a[i + 1:] if c not in seen and c in dna: dfs(c) ans = 0 for i in genes: if i not in seen: ans += 1 dfs(i) return ans |
class Solution: def geneMutationGroup(self, genes): seen = set() dna = set(genes) def dfs(a): seen.add(a) for i in range(len(a)): for x in ('A', 'G', 'T', 'C'): c = a[:i] + x + a[i + 1:] if c not in seen and c in dna: dfs(c) ans = 0 for i in genes: if i not in seen: ans += 1 dfs(i) return ans
Another alternative implementation of Depth First Search algorithm would be to have all vertices in a hash set, and then remove it as we follow the DFS process. The time complexity is O(N*K*K) where N is the number of Genes, and K is the longest Gene sequence. The space complexity is O(NK)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | class Solution: def geneMutationGroup(self, genes): left = set(genes) def dfs(a): left.remove(a) for i in range(len(a)): for x in ('A', 'G', 'T', 'C'): c = a[:i] + x + a[i + 1:] if c in left: dfs(c) ans = 0 for i in genes: if i in left: ans += 1 dfs(i) return ans |
class Solution: def geneMutationGroup(self, genes): left = set(genes) def dfs(a): left.remove(a) for i in range(len(a)): for x in ('A', 'G', 'T', 'C'): c = a[:i] + x + a[i + 1:] if c in left: dfs(c) ans = 0 for i in genes: if i in left: ans += 1 dfs(i) return ans
Note, this problem can also be solved via Union Find + Disjoint Set.
See also – traversing a Graph can also be done in Breadth First Search Algorithm: Teaching Kids Programming – Find Gene Mutation Groups via Breadth First Search Algorithm
Find Gene Mutation Groups
- Teaching Kids Programming – Find Gene Mutation Groups via UnionFind Algorithm (Disjoint Set)
- Teaching Kids Programming – Find Gene Mutation Groups via Breadth First Search Algorithm
- Teaching Kids Programming – Find Gene Mutation Groups via Recursive Depth First Search Algorithm
–EOF (The Ultimate Computing & Technology Blog) —
loading...
Last Post: Teaching Kids Programming - Algorithm to Truncate Sentence via Split Function
Next Post: Teaching Kids Programming - Find Gene Mutation Groups via Breadth First Search Algorithm