Minimum Genetic Mutation - Solution & Explanation

Q: Is Minimum Genetic Mutation easy or hard?

Minimum Genetic Mutation is rated Medium difficulty. The main challenge is recognizing that the problem forms an implicit graph where each valid mutation is an edge. Once modeled correctly, the BFS implementation is straightforward.

Q: How to solve Minimum Genetic Mutation in O(N)?

An optimized BFS with a hash set for the gene bank effectively runs in near O(N × L) time since L is a small constant. Each valid gene is processed once, and mutation candidates are generated by changing one character at a time. Using a set allows O(1) lookup when verifying whether a mutation exists in the bank.

Q: What is the best approach for Minimum Genetic Mutation?

Breadth-First Search (BFS) is the best approach because the problem asks for the minimum number of mutations, which maps directly to the shortest path in an unweighted graph. Each gene represents a node and each valid one-character mutation forms an edge. BFS guarantees the first time you reach the target gene is the minimum number of steps.

Q: What data structure is used in Minimum Genetic Mutation?

The core data structures are a queue for BFS traversal and a hash set for storing the gene bank and visited genes. The queue processes genes level by level, while the hash set allows constant-time checks to validate mutations and prevent revisiting states.

Q: What is the time complexity of Minimum Genetic Mutation?

The typical BFS solution runs in O(N × L × 4) time, where N is the number of genes in the bank and L is the gene length (usually 8). For every gene explored, you attempt L positions and 4 possible characters. Space complexity is O(N) for the queue and visited set.

Q: Minimum Genetic Mutation Python or Java solution approach?

Both Python and Java implementations typically use BFS with a queue and a hash set. Generate mutations by iterating over each position in the gene string and replacing it with A, C, G, or T. If the mutated gene exists in the bank and hasn't been visited, add it to the queue for the next BFS level.

Q: Is Minimum Genetic Mutation asked at Google, Amazon, or Meta?

Minimum Genetic Mutation is a common interview-style graph problem similar to Word Ladder. Variants that use BFS on strings or transformation graphs have appeared in interviews at companies like Amazon, Google, and Meta. Interviewers use it to test graph traversal and state-space search.

MediumHash Table String Breadth-First Search22 min readAsked at: Amazon, Microsoft, Meta +3

Practice this problem

Problem Statement

A gene string can be represented by an 8-character long string, with choices from 'A', 'C', 'G', and 'T'.

Suppose we need to investigate a mutation from a gene string startGene to a gene string endGene where one mutation is defined as one single character changed in the gene string.

For example, "AACCGGTT" --> "AACCGGTA" is one mutation.

There is also a gene bank bank that records all the valid gene mutations. A gene must be in bank to make it a valid gene string.

Given the two gene strings startGene and endGene and the gene bank bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such a mutation, return -1.

Note that the starting point is assumed to be valid, so it might not be included in the bank.

Example 1:

Input: startGene = "AACCGGTT", endGene = "AACCGGTA", bank = ["AACCGGTA"]
Output: 1

Example 2:

Input: startGene = "AACCGGTT", endGene = "AAACGGTA", bank = ["AACCGGTA","AACCGCTA","AAACGGTA"]
Output: 2

Constraints:

0 <= bank.length <= 10
startGene.length == endGene.length == bank[i].length == 8
startGene, endGene, and bank[i] consist of only the characters ['A', 'C', 'G', 'T'].

Approach Overview

Problem Overview: You start with a gene string and need to transform it into a target gene. Each mutation changes exactly one character, and every intermediate gene must exist in the given bank. The task is to compute the minimum number of mutations required, or return -1 if the transformation is impossible.

Approach 1: Breadth-First Search (BFS) (Time: O(N × L × 4), Space: O(N))

Treat each gene as a node in an implicit graph. Two genes are connected if they differ by exactly one character. Start a BFS from the startGene and explore all valid mutations level by level. For each position in the gene string, try replacing it with the four possible characters (A, C, G, T). If the new gene exists in the bank and has not been visited, push it into the queue. Because Breadth-First Search explores by distance, the first time you reach endGene gives the minimum mutation count.

Using a hash table or set for the bank enables O(1) membership checks while generating mutations. The gene length L is fixed (typically 8), so each state generates at most L × 4 candidates. BFS works well here because the problem asks for the shortest transformation path in an unweighted graph.

Approach 2: Bidirectional BFS (Time: O(N × L × 4), Space: O(N))

Bidirectional BFS improves practical performance by searching from both the start and end genes simultaneously. Maintain two frontier sets: one expanding from startGene and another from endGene. At each step, expand the smaller frontier by generating all one-character mutations and checking whether any mutation appears in the opposite frontier.

This approach reduces the search depth roughly by half, which significantly cuts the number of explored states when the bank is large. A hash set stores the valid genes, and visited genes are removed to avoid revisiting. The mutation generation logic is the same as in the standard BFS, still operating on the string representation of genes.

Recommended for interviews: The standard BFS solution is what most interviewers expect first because it directly models the problem as a shortest-path search in an unweighted graph. Implementing BFS with a queue and hash set demonstrates solid graph traversal fundamentals. After presenting it, mentioning Bidirectional BFS shows deeper optimization awareness and stronger problem-solving maturity.

Approach 1: Breadth-First Search (BFS) Approach

In this approach, we utilize the Breadth-First Search algorithm to explore the mutation process. The problem can be transformed into finding the shortest path in an unweighted graph (where nodes are gene strings and edges represent valid transformations as per the gene bank).

Steps involved:

Use a queue to perform the BFS starting from the startGene.
Use a set to store visited nodes to prevent re-processing.
The queue will store pairs representing the current gene string and the number of mutations so far.
Iteratively explore valid mutations (one letter changes) that are in the gene bank.
If the endGene is reached, return the mutation count.
If the queue is exhausted without finding the endGene, return -1.

This C program uses a queue to perform a BFS search. It starts from the startGene and attempts to form every possible valid mutation step by step using the genes in the bank. A visited array prevents cycles.

Code

C C++Java Python C#JavaScript

C++

Java

Python

JavaScript

Complexity

Time Complexity: O(N * M * 4) where N is the number of steps in bank, M is length of startGene (8), and 4 is from trying each nucleotide.

Space Complexity: O(N) for the queue.

Try this approach in the editor →

Approach 2: Bidirectional BFS Approach

This approach aims to optimize the BFS by conducting simultaneous searches from both the start and end genes. This reduces the search space effectively.

Steps:

Initialize forward and backward sets with the startGene and endGene, respectively.
Use a set for the gene bank to ensure quick look-up and process unvisited genes.
Iteratively expand the forward and backward sets, swapping as necessary to always expand the smaller set.
If a common gene is found in both sets, return the combined step count.
If expansion is exhausted without finding an intersection, return -1.

This C++ solution implements bidirectional BFS to find the shortest mutation path by expanding the smaller frontier set of genes in each step, improving efficiency.

Code

C++Python

C++

Python

Complexity

Time Complexity: O(N * M * 4), with quicker terminations due to bidirectional checks.

Space Complexity: O(N) for two sets tracking forward and backward layers.

Try this approach in the editor →

Approach 3: BFS

We define a queue q to store the current gene sequence and the number of changes, and a set vis to store the visited gene sequences. Initially, we add the starting gene sequence start to the queue q and the set vis.

Then, we continuously take out a gene sequence from the queue q. If this gene sequence equals the target gene sequence, we return the current number of changes. Otherwise, we iterate through the gene bank bank, calculate the difference value between the current gene sequence and the gene sequence in the gene bank. If the difference value is 1 and the gene sequence in the gene bank has not been visited, we add it to the queue q and the set vis.

If the queue q is empty, it means that the gene change cannot be completed, so we return -1.

The time complexity is O(C times n times m), and the space complexity is O(n times m). Where n and m are the lengths of the gene sequence and the gene bank respectively, and C is the size of the character set of the gene sequence. In this problem, C = 4.

Code

Python Java C++Go TypeScript Rust

Python

Java

C++

TypeScript

Rust

Try this approach in the editor →

Complexity Comparison

Approach	Complexity
Breadth-First Search (BFS) Approach	Time Complexity: O(N * M * 4) where N is the number of steps in bank, M is length of startGene (8), and 4 is from trying each nucleotide. Space Complexity: O(N) for the queue.
Bidirectional BFS Approach	Time Complexity: O(N * M * 4), with quicker terminations due to bidirectional checks. Space Complexity: O(N) for two sets tracking forward and backward layers.
BFS	—

Detailed Complexity Analysis

Approach	Time	Space	When to Use
Breadth-First Search (BFS)	O(N × L × 4)	O(N)	General solution. Best when modeling the problem as shortest path in an unweighted graph.
Bidirectional BFS	O(N × L × 4)	O(N)	Prefer when the gene bank is large and you want to reduce search depth by expanding from both start and end.

Video Solution

Minimum Genetic Mutation -(Asked by Twitter) : Explanation ➕ Live Coding • codestorywithMIK • 70,886 views views

Watch 9 more video solutions →

Frequently Asked Questions

Is Minimum Genetic Mutation easy or hard?

Minimum Genetic Mutation is rated Medium difficulty. The main challenge is recognizing that the problem forms an implicit graph where each valid mutation is an edge. Once modeled correctly, the BFS implementation is straightforward.

How to solve Minimum Genetic Mutation in O(N)?

An optimized BFS with a hash set for the gene bank effectively runs in near O(N × L) time since L is a small constant. Each valid gene is processed once, and mutation candidates are generated by changing one character at a time. Using a set allows O(1) lookup when verifying whether a mutation exists in the bank.

What is the best approach for Minimum Genetic Mutation?

Breadth-First Search (BFS) is the best approach because the problem asks for the minimum number of mutations, which maps directly to the shortest path in an unweighted graph. Each gene represents a node and each valid one-character mutation forms an edge. BFS guarantees the first time you reach the target gene is the minimum number of steps.

What data structure is used in Minimum Genetic Mutation?

The core data structures are a queue for BFS traversal and a hash set for storing the gene bank and visited genes. The queue processes genes level by level, while the hash set allows constant-time checks to validate mutations and prevent revisiting states.

What is the time complexity of Minimum Genetic Mutation?

The typical BFS solution runs in O(N × L × 4) time, where N is the number of genes in the bank and L is the gene length (usually 8). For every gene explored, you attempt L positions and 4 possible characters. Space complexity is O(N) for the queue and visited set.

Minimum Genetic Mutation Python or Java solution approach?

Both Python and Java implementations typically use BFS with a queue and a hash set. Generate mutations by iterating over each position in the gene string and replacing it with A, C, G, or T. If the mutated gene exists in the bank and hasn't been visited, add it to the queue for the next BFS level.

Is Minimum Genetic Mutation asked at Google, Amazon, or Meta?

Minimum Genetic Mutation is a common interview-style graph problem similar to Word Ladder. Variants that use BFS on strings or transformation graphs have appeared in interviews at companies like Amazon, Google, and Meta. Interviewers use it to test graph traversal and state-space search.

Ready to solve this problem?

Practice Minimum Genetic Mutation with our built-in code editor and test cases.

Practice on FleetCode

Word Ladder

Problem Info

DifficultyMedium

Acceptance56.5%

Approaches3

Reading time22 min

Asked at

Amazon Microsoft Meta Google Bloomberg

Practice this problem

Open in Editor

Minimum Genetic Mutation - Solution & Explanation

Problem Statement

Approach Overview

Approach 1: Breadth-First Search (BFS) Approach

Code

Complexity

Approach 2: Bidirectional BFS Approach

Code

Complexity

Approach 3: BFS

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Minimum Genetic Mutation - Solution & Explanation

Problem Statement

Approach Overview

Approach 1: Breadth-First Search (BFS) Approach

Code

Complexity

Approach 2: Bidirectional BFS Approach

Code

Complexity

Approach 3: BFS

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Breadth-First Search (BFS) Approach

Code

Complexity

Approach 2: Bidirectional BFS Approach

Code

Complexity

Approach 3: BFS

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Breadth-First Search (BFS) Approach

Code

Complexity

Approach 2: Bidirectional BFS Approach

Code

Complexity

Approach 3: BFS

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents