Teaching Kids Programming – Minimum Genetic Mutation via Breadth First Search Algorithm


Teaching Kids Programming: Videos on Data Structures and Algorithms

A gene string can be represented by an 8-character string, where each character is one of ‘A’, ‘C’, ‘G’, and ‘T’. Suppose we need to investigate a mutation from a gene string startGene to a gene string endGene, where one mutation is defined as a single character changed in the gene string. For example, “AACCGGTT” -> “AACCGGTA” is one mutation. There is also a gene bank, bank, that records all the valid gene strings. A gene must be in bank to be a valid gene string. Given the two gene strings startGene and endGene and the gene bank bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such mutation, return -1.

Note that the starting point is assumed to be valid, so it might not be included in the bank.

Example 1:
Input: startGene = “AACCGGTT”, endGene = “AACCGGTA”, bank = [“AACCGGTA”]
Output: 1

Example 2:
Input: startGene = “AACCGGTT”, endGene = “AAACGGTA”, bank = [“AACCGGTA”, “AACCGCTA”, “AAACGGTA”]
Output: 2

Constraints:
0 <= bank.length <= 10
startGene.length == endGene.length == bank[i].length == 8
startGene, endGene, and bank[i] consist of only the characters [‘A’, ‘C’, ‘G’, ‘T’].

Minimum Genetic Mutation via Breadth First Search Algorithm

This problem can be viewed as a graph problem: finding the shortest path in an undirected, unweighted graph. The gene strings are the nodes, and a single mutation between two valid genes is an edge. The valid genes are given in a list, which we convert to a set for O(1) lookups.
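To make the graph view concrete, here is a minimal sketch that builds the graph for Example 2 explicitly. The helper names hamming and build_graph are made up for this illustration and are not part of the solution; the BFS itself never needs to build the graph up front.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def build_graph(start: str, bank: list) -> dict:
    """Map each gene to the set of genes exactly one mutation away.

    Nodes are the start gene plus every gene in the bank (hypothetical
    helper, for illustration only).
    """
    nodes = list(dict.fromkeys([start] + list(bank)))  # drop duplicates, keep order
    graph = {gene: set() for gene in nodes}
    for a, b in combinations(nodes, 2):
        if hamming(a, b) == 1:
            # One mutation works in both directions, so the edge is undirected.
            graph[a].add(b)
            graph[b].add(a)
    return graph


graph = build_graph("AACCGGTT", ["AACCGGTA", "AACCGCTA", "AAACGGTA"])
for gene, neighbors in graph.items():
    print(gene, sorted(neighbors))
```

In this graph the start gene “AACCGGTT” is connected only to “AACCGGTA”, which in turn is connected to the end gene “AAACGGTA”, so the shortest path has 2 edges, matching the expected output of Example 2.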

We use a double-ended queue (deque) to implement Breadth First Search (BFS), storing each gene together with its distance (number of steps) from the start as a tuple. If a gene string has n characters, it has 3*n possible single-character mutations, because each character can be replaced by any of the 3 other letters. We also keep the visited genes in a hash set so that we do not get stuck in a cycle.
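The 3*n count can be checked with a small sketch. The generator name one_step_mutations is an assumption made for this example; the solution does the same enumeration inline with two nested loops.

```python
def one_step_mutations(gene: str, alphabet: str = "ACGT"):
    """Yield every string that differs from `gene` in exactly one position."""
    for i, original in enumerate(gene):
        for c in alphabet:
            if c != original:  # replacing a letter with itself is not a mutation
                yield gene[:i] + c + gene[i + 1:]


candidates = list(one_step_mutations("AACCGGTT"))
print(len(candidates))  # 3 * 8 = 24
```

Only the candidates that also appear in the bank are valid next steps, which is why the bank is stored as a set.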

from collections import deque
from typing import List


class Solution:
    def minMutation(self, start: str, end: str, bank: List[str]) -> int:
        bank = set(bank)  # set gives O(1) membership tests
        queue = deque([(start, 0)])  # (gene, number of mutations so far)
        seen = {start}

        while queue:
            node, steps = queue.popleft()
            if node == end:
                return steps

            # Try every single-character change of the current gene.
            for i in range(len(node)):
                for c in "ACGT":
                    neighbor = node[:i] + c + node[i + 1:]
                    # node itself is already in seen, so it is skipped as well.
                    if neighbor in bank and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append((neighbor, steps + 1))
        return -1

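The time and space complexity are both O(B), where B is the number of genes in the bank: only the start gene and genes found in the bank are ever added to the queue, and each is expanded at most once. Each expansion tries 4*n candidate strings, where n is the gene length; since n is fixed at 8 here, this is a constant factor.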

–EOF (The Ultimate Computing & Technology Blog) —
