Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X.
For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".
Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list strs of strings where every string in strs is an anagram of every other string in strs. How many groups are there?
Example 1:
Input: strs = ["tars","rats","arts","star"] Output: 2
Example 2:
Input: strs = ["omv","ovm"] Output: 1
Constraints:
1 <= strs.length <= 3001 <= strs[i].length <= 300strs[i] consists of lowercase letters only.strs have the same length and are anagrams of each other.Problem Overview: You are given an array of strings where every string is an anagram of the others. Two strings are considered similar if you can swap exactly two characters in one string to make it equal to the other. The task is to count how many groups of similar strings exist.
The key observation: similarity is a transitive relationship. If a is similar to b, and b is similar to c, then all belong to the same group even if a and c are not directly similar. This naturally forms connected components in a graph.
Approach 1: Union-Find (Disjoint Set) (O(n2 * m) time, O(n) space)
Treat each string as a node. Compare every pair of strings and check whether they differ in exactly two positions (or zero). If they are similar, merge their sets using a union-find structure. The similarity check scans characters and records mismatched positions. If the mismatch count exceeds two, stop early. After processing all pairs, the number of unique parents in the disjoint set represents the number of groups. This approach works well because union-find efficiently merges connected components with near-constant amortized operations.
Approach 2: Depth-First Search Graph Traversal (O(n2 * m) time, O(n) space)
Model the problem as an undirected graph where an edge connects two strings if they are similar. Iterate through all string pairs and check similarity. Then run DFS from each unvisited node to mark all reachable nodes in that component. Each DFS traversal corresponds to one similar string group. The similarity check still costs O(m) per pair, so the total complexity becomes O(n^2 * m). This method is conceptually simple and easy to implement if you're comfortable with graph traversal.
Both approaches rely on pairwise comparisons because similarity depends on character positions. The array size n determines how many comparisons happen, while string length m determines the cost of each similarity check.
Recommended for interviews: The Union-Find approach is usually preferred. Interviewers expect you to recognize the connected-components pattern and apply a disjoint-set structure. Explaining the graph interpretation first shows clear reasoning, while implementing union-find demonstrates familiarity with a standard connectivity algorithm used across many array and graph problems.
In this approach, we treat each string as a node in a graph. Two nodes are connected if the corresponding strings are similar. We use a Union-Find data structure to efficiently group the strings into connected components based on similarity.
This solution uses the Union-Find data structure to keep track of connected string groups. The areSimilar() function calculates string similarity by counting character mismatches. If a mismatch is found more than twice, the strings are not similar. The main function initializes the parents and processes connections through the union-find operations to count the disjoint sets.
Time Complexity: O(n^2 * k) where n is the number of strings and k is the average length of the strings.
Space Complexity: O(n), storing the parent array.
This approach considers each string as a node and treats detecting similarities like exploring components in a graph. We use a depth-first search (DFS) algorithm to explore nodes. If two strings are similar, we explore all strings similar to this pair within a recursive call.
This C solution employs DFS to exhaustively search connected components formed by similar strings. We use a boolean array to track visited nodes and explore each group fully once a node from that group is encountered.
Time Complexity: O(n^2 * k) where n is the number of strings and k is the length of strings.
Space Complexity: O(n) to maintain the visited array.
We can enumerate any two strings s and t in the list of strings. Since s and t are anagrams, if the number of differing characters at corresponding positions between s and t does not exceed 2, then s and t are similar. We can use the union-find data structure to merge s and t. If the merge is successful, the number of similar string groups decreases by 1.
The final number of similar string groups is the number of connected components in the union-find structure.
Time complexity is O(n^2 times (m + \alpha(n))), and space complexity is O(n). Here, n and m are the length of the list of strings and the length of the strings, respectively, and \alpha(n) is the inverse Ackermann function, which can be considered a very small constant.
Python
Java
C++
Go
TypeScript
| Approach | Complexity |
|---|---|
| Union-Find Approach | Time Complexity: O(n^2 * k) where n is the number of strings and k is the average length of the strings. |
| Depth-First Search (DFS) Approach | Time Complexity: O(n^2 * k) where n is the number of strings and k is the length of strings. |
| Union-Find | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Brute Force Pairwise Grouping | O(n^2 * m) | O(1) | Conceptual baseline when first analyzing similarity checks |
| Depth-First Search (Graph Components) | O(n^2 * m) | O(n) | When modeling the problem explicitly as a graph traversal |
| Union-Find (Disjoint Set) | O(n^2 * m) | O(n) | Best general solution for counting connected components efficiently |
Similar String Groups | Leetcode - 839 | DFS & BFS | AMAZON | Explanation ➕ Live Coding • codestorywithMIK • 6,243 views views
Watch 9 more video solutions →Practice Similar String Groups with our built-in code editor and test cases.
Practice on FleetCode