Watch 10 video solutions for Minimum Genetic Mutation, a medium level problem involving Hash Table, String, Breadth-First Search. This walkthrough by codestorywithMIK has 70,886 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
A gene string can be represented by an 8-character long string, with choices from 'A', 'C', 'G', and 'T'.
Suppose we need to investigate a mutation from a gene string startGene to a gene string endGene where one mutation is defined as one single character changed in the gene string.
"AACCGGTT" --> "AACCGGTA" is one mutation.There is also a gene bank bank that records all the valid gene mutations. A gene must be in bank to make it a valid gene string.
Given the two gene strings startGene and endGene and the gene bank bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such a mutation, return -1.
Note that the starting point is assumed to be valid, so it might not be included in the bank.
Example 1:
Input: startGene = "AACCGGTT", endGene = "AACCGGTA", bank = ["AACCGGTA"] Output: 1
Example 2:
Input: startGene = "AACCGGTT", endGene = "AAACGGTA", bank = ["AACCGGTA","AACCGCTA","AAACGGTA"] Output: 2
Constraints:
0 <= bank.length <= 10startGene.length == endGene.length == bank[i].length == 8startGene, endGene, and bank[i] consist of only the characters ['A', 'C', 'G', 'T'].Problem Overview: You start with a gene string and need to transform it into a target gene. Each mutation changes exactly one character, and every intermediate gene must exist in the given bank. The task is to compute the minimum number of mutations required, or return -1 if the transformation is impossible.
Approach 1: Breadth-First Search (BFS) (Time: O(N × L × 4), Space: O(N))
Treat each gene as a node in an implicit graph. Two genes are connected if they differ by exactly one character. Start a BFS from the startGene and explore all valid mutations level by level. For each position in the gene string, try replacing it with the four possible characters (A, C, G, T). If the new gene exists in the bank and has not been visited, push it into the queue. Because Breadth-First Search explores by distance, the first time you reach endGene gives the minimum mutation count.
Using a hash table or set for the bank enables O(1) membership checks while generating mutations. The gene length L is fixed (typically 8), so each state generates at most L × 4 candidates. BFS works well here because the problem asks for the shortest transformation path in an unweighted graph.
Approach 2: Bidirectional BFS (Time: O(N × L × 4), Space: O(N))
Bidirectional BFS improves practical performance by searching from both the start and end genes simultaneously. Maintain two frontier sets: one expanding from startGene and another from endGene. At each step, expand the smaller frontier by generating all one-character mutations and checking whether any mutation appears in the opposite frontier.
This approach reduces the search depth roughly by half, which significantly cuts the number of explored states when the bank is large. A hash set stores the valid genes, and visited genes are removed to avoid revisiting. The mutation generation logic is the same as in the standard BFS, still operating on the string representation of genes.
Recommended for interviews: The standard BFS solution is what most interviewers expect first because it directly models the problem as a shortest-path search in an unweighted graph. Implementing BFS with a queue and hash set demonstrates solid graph traversal fundamentals. After presenting it, mentioning Bidirectional BFS shows deeper optimization awareness and stronger problem-solving maturity.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Breadth-First Search (BFS) | O(N × L × 4) | O(N) | General solution. Best when modeling the problem as shortest path in an unweighted graph. |
| Bidirectional BFS | O(N × L × 4) | O(N) | Prefer when the gene bank is large and you want to reduce search depth by expanding from both start and end. |