Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X.
For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".
Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list strs of strings where every string in strs is an anagram of every other string in strs. How many groups are there?
Example 1:
Input: strs = ["tars","rats","arts","star"] Output: 2
Example 2:
Input: strs = ["omv","ovm"] Output: 1
Constraints:
1 <= strs.length <= 3001 <= strs[i].length <= 300strs[i] consists of lowercase letters only.strs have the same length and are anagrams of each other.In #839 Similar String Groups, the goal is to determine how many groups of strings are connected through similarity. Two strings are considered similar if they are identical or can become identical by swapping exactly two characters. This naturally forms a graph where each string is a node and an edge exists if two strings are similar.
A common strategy is to compare every pair of strings and connect them using Union Find (Disjoint Set) or by building components with DFS/BFS. When two strings differ at only two positions (or zero), they belong to the same group. Union Find efficiently merges such pairs and tracks the number of connected components.
The main challenge is efficiently checking similarity between strings of equal length. Since every pair may need to be compared, the typical complexity becomes O(n^2 * m), where n is the number of strings and m is the string length. Space complexity is usually O(n) for the disjoint set or visited structures.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Union Find (Disjoint Set) | O(n^2 * m) | O(n) |
| DFS/BFS Graph Traversal | O(n^2 * m) | O(n) |
NeetCode
In this approach, we treat each string as a node in a graph. Two nodes are connected if the corresponding strings are similar. We use a Union-Find data structure to efficiently group the strings into connected components based on similarity.
Time Complexity: O(n^2 * k) where n is the number of strings and k is the average length of the strings.
Space Complexity: O(n), storing the parent array.
1using System;
2
3class Solution {
4 private int Find(int[] parent, int x) {
5 if (parent[x] != x) parent[x] = Find(parent, parent[x]);
6 return parent[x];
7 }
8
9 private void Union(int[] parent, int x, int y) {
10 int rootX = Find(parent, x);
11 int rootY = Find(parent, y);
12 if (rootX != rootY) parent[rootY] = rootX;
13 }
14
15 private bool AreSimilar(string a, string b) {
16 int diff = 0;
17 for (int i = 0; i < a.Length; i++) {
18 if (a[i] != b[i]) {
19 diff++;
20 if (diff > 2) return false;
21 }
22 }
23 return diff == 2 || diff == 0;
24 }
25
26 public int NumSimilarGroups(string[] strs) {
27 int n = strs.Length;
28 int[] parent = new int[n];
29 for (int i = 0; i < n; i++) {
30 parent[i] = i;
31 }
32
33 for (int i = 0; i < n; i++) {
34 for (int j = i + 1; j < n; j++) {
35 if (AreSimilar(strs[i], strs[j])) {
36 Union(parent, i, j);
37 }
38 }
39 }
40
41 int count = 0;
42 for (int i = 0; i < n; i++) {
43 if (Find(parent, i) == i) count++;
44 }
45 return count;
46 }
47
48 static void Main() {
49 Solution s = new Solution();
50 string[] strs = {"tars", "rats", "arts", "star"};
51 Console.WriteLine(s.NumSimilarGroups(strs));
52 }
53}
54The C# code uses Union-Find with the path compression technique for optimal performance. The AreSimilar() function ensures that two strings can be in the same group if they allow for only two mismatches, thus allowing swapping to form similarity.
This approach considers each string as a node and treats detecting similarities like exploring components in a graph. We use a depth-first search (DFS) algorithm to explore nodes. If two strings are similar, we explore all strings similar to this pair within a recursive call.
Time Complexity: O(n^2 * k) where n is the number of strings and k is the length of strings.
Space Complexity: O(n) to maintain the visited array.
1
Watch expert explanations and walkthroughs
Practice problems asked by these companies to ace your technical interviews.
Explore More ProblemsJot down your thoughts, approach, and key learnings
Yes, Similar String Groups is considered an advanced graph/union-find problem and reflects patterns commonly asked in FAANG-level interviews. It tests understanding of connectivity, graph modeling, and efficient grouping techniques.
The most common optimal approach uses Union Find (Disjoint Set Union). By comparing pairs of strings and unioning those that are similar, we can efficiently count the number of connected components representing groups.
Union Find is typically the best data structure because it efficiently merges groups and tracks connected components. Graph traversal with DFS or BFS is also a valid alternative.
Two strings are similar if they differ in exactly two positions or not at all. This means one string can be transformed into the other with a single swap of characters.
This C solution employs DFS to exhaustively search connected components formed by similar strings. We use a boolean array to track visited nodes and explore each group fully once a node from that group is encountered.