Watch 10 video solutions for Similar String Groups, a hard-level problem involving Array, Hash Table, String. This walkthrough by codestorywithMIK has 6,243 views.
Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X.
For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".
Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list strs of strings where every string in strs is an anagram of every other string in strs. How many groups are there?
Example 1:
Input: strs = ["tars","rats","arts","star"]
Output: 2
Example 2:
Input: strs = ["omv","ovm"]
Output: 1
Constraints:
1 <= strs.length <= 300
1 <= strs[i].length <= 300
strs[i] consists of lowercase letters only.
All words in strs have the same length and are anagrams of each other.
Problem Overview: You are given an array of strings where every string is an anagram of the others. Two strings are considered similar if they are identical or if swapping two characters at distinct positions in one string makes it equal to the other. The task is to count how many groups of similar strings exist.
The key observation: similarity itself is not transitive, but group membership is. If a is similar to b, and b is similar to c, then all three belong to the same group even if a and c are not directly similar. Groups are therefore the connected components of a graph whose nodes are the strings and whose edges join similar pairs.
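To make the similarity relation concrete, here is a minimal sketch of the check in Python. The function name `are_similar` is an illustrative choice, not something from the problem statement, and the sketch assumes its inputs are equal-length anagrams, as the constraints guarantee. Under that assumption, counting mismatched positions is enough: zero mismatches means the strings are identical, and exactly two means a single swap makes them equal.

```python
def are_similar(a: str, b: str) -> bool:
    """Return True if a equals b or one swap of two positions in a yields b.

    Assumes a and b are anagrams of equal length (guaranteed by the problem).
    """
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > 2:  # more than two differing positions: stop early
                return False
    # For anagrams, exactly one mismatch cannot occur, so this means 0 or 2.
    return True


print(are_similar("tars", "rats"))  # True
print(are_similar("rats", "arts"))  # True
print(are_similar("tars", "arts"))  # False
```

The last call shows why the relation is not transitive: "tars" and "arts" differ in three positions, yet both are similar to "rats" and so end up in the same group.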
Approach 1: Union-Find (Disjoint Set) (O(n^2 * m) time, O(n) space)
Treat each string as a node. Compare every pair of strings and check whether they differ in exactly two positions (or zero). If they are similar, merge their sets using a union-find structure. The similarity check scans characters and counts mismatched positions; if the mismatch count exceeds two, it stops early. After processing all pairs, the number of distinct roots in the disjoint set (elements that are their own parent) is the number of groups. This approach works well because union-find merges connected components in near-constant amortized time per operation.
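The union-find approach described above could be sketched as follows. This is one possible implementation, not a reference solution: the names `num_similar_groups`, `find`, `union`, and `similar` are illustrative, and the choice of path halving with union by size is an assumption, since the text does not specify which optimizations to use. Instead of counting roots at the end, the sketch starts with n groups and decrements on every successful merge, which gives the same result.

```python
from typing import List


def num_similar_groups(strs: List[str]) -> int:
    n = len(strs)
    parent = list(range(n))  # each string starts as its own root
    size = [1] * n

    def find(i: int) -> int:
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i: int, j: int) -> bool:
        # Merge the two sets; return True only if they were separate.
        ri, rj = find(i), find(j)
        if ri == rj:
            return False
        if size[ri] < size[rj]:
            ri, rj = rj, ri
        parent[rj] = ri  # attach the smaller tree under the larger one
        size[ri] += size[rj]
        return True

    def similar(a: str, b: str) -> bool:
        # Inputs are anagrams, so 0 or 2 mismatches means similar.
        diff = 0
        for x, y in zip(a, b):
            if x != y:
                diff += 1
                if diff > 2:
                    return False
        return True

    groups = n
    for i in range(n):
        for j in range(i + 1, n):
            if similar(strs[i], strs[j]) and union(i, j):
                groups -= 1  # two components became one
    return groups


print(num_similar_groups(["tars", "rats", "arts", "star"]))  # 2
print(num_similar_groups(["omv", "ovm"]))  # 1
```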
Approach 2: Depth-First Search Graph Traversal (O(n^2 * m) time, O(n) space)
Model the problem as an undirected graph where an edge connects two strings if they are similar. Iterate through all string pairs and check similarity. Then run DFS from each unvisited node to mark all reachable nodes in that component. Each DFS traversal corresponds to one similar string group. The similarity check still costs O(m) per pair, so the total complexity becomes O(n^2 * m). The O(n) space bound holds when similarity is checked on the fly during traversal, using only a visited array and the DFS stack; storing an explicit adjacency list can take up to O(n^2) space. This method is conceptually simple and easy to implement if you're comfortable with graph traversal.
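A rough sketch of the graph-traversal approach is shown below, using an explicit stack rather than recursion. The function name `num_similar_groups_dfs` and the helper `similar` are illustrative names, not part of the problem. Note that this version materializes an adjacency list, which can hold up to O(n^2) edges; testing similarity during the traversal instead would keep the extra space at O(n).

```python
from typing import List


def num_similar_groups_dfs(strs: List[str]) -> int:
    n = len(strs)

    def similar(a: str, b: str) -> bool:
        # Inputs are anagrams, so 0 or 2 mismatches means similar.
        diff = 0
        for x, y in zip(a, b):
            if x != y:
                diff += 1
                if diff > 2:
                    return False
        return True

    # Build the undirected similarity graph over string indices.
    adj: List[List[int]] = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if similar(strs[i], strs[j]):
                adj[i].append(j)
                adj[j].append(i)

    visited = [False] * n
    groups = 0
    for start in range(n):
        if visited[start]:
            continue
        groups += 1  # every unvisited start node opens a new component
        visited[start] = True
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in adj[node]:
                if not visited[nxt]:
                    visited[nxt] = True
                    stack.append(nxt)
    return groups


print(num_similar_groups_dfs(["tars", "rats", "arts", "star"]))  # 2
print(num_similar_groups_dfs(["omv", "ovm"]))  # 1
```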
Both approaches rely on pairwise comparisons because similarity depends on character positions. The array size n determines how many comparisons happen, while string length m determines the cost of each similarity check.
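As a quick sanity check on the O(n^2 * m) bound, the snippet below plugs in the largest values the constraints allow (n = 300 strings of length m = 300). It is only a back-of-the-envelope count of character comparisons in the worst case, ignoring early exits from the similarity check.

```python
from math import comb

n, m = 300, 300          # upper bounds from the constraints
pairs = comb(n, 2)       # number of unordered string pairs
print(pairs)             # 44850
print(pairs * m)         # 13455000 character comparisons at most
```

Roughly 13 million character comparisons is a small workload, which suggests why the pairwise approach is acceptable at these input sizes.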
Recommended for interviews: The Union-Find approach is usually preferred. Interviewers expect you to recognize the connected-components pattern and apply a disjoint-set structure. Explaining the graph interpretation first shows clear reasoning, while implementing union-find demonstrates familiarity with a standard connectivity algorithm used across many array and graph problems.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Brute Force Pairwise Grouping | O(n^2 * m) | O(1) | Conceptual baseline when first analyzing similarity checks |
| Depth-First Search (Graph Components) | O(n^2 * m) | O(n) | When modeling the problem explicitly as a graph traversal |
| Union-Find (Disjoint Set) | O(n^2 * m) | O(n) | Best general solution for counting connected components efficiently |