#839 Similar String Groups - Solution

Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X.

For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".

Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.

We are given a list strs of strings where every string in strs is an anagram of every other string in strs. How many groups are there?

Example 1:

Input: strs = ["tars","rats","arts","star"]
Output: 2

Example 2:

Input: strs = ["omv","ovm"]
Output: 1

Constraints:

1 <= strs.length <= 300
1 <= strs[i].length <= 300
strs[i] consists of lowercase letters only.
All words in strs have the same length and are anagrams of each other.

In #839 Similar String Groups, the goal is to determine how many groups of strings are connected through similarity. Two strings are considered similar if they are identical or can become identical by swapping exactly two characters. This naturally forms a graph where each string is a node and an edge exists if two strings are similar.

A common strategy is to compare every pair of strings and connect them using Union Find (Disjoint Set) or by building components with DFS/BFS. When two strings differ at only two positions (or zero), they belong to the same group. Union Find efficiently merges such pairs and tracks the number of connected components.

The main challenge is efficiently checking similarity between strings of equal length. Since every pair may need to be compared, the typical complexity becomes O(n^2 * m), where n is the number of strings and m is the string length. Space complexity is usually O(n) for the disjoint set or visited structures.

Approach	Time Complexity	Space Complexity
Union Find (Disjoint Set)	O(n^2 * m)	O(n)
DFS/BFS Graph Traversal	O(n^2 * m)	O(n)

Solutions (12)

Union-Find Approach

In this approach, we treat each string as a node in a graph. Two nodes are connected if the corresponding strings are similar. We use a Union-Find data structure to efficiently group the strings into connected components based on similarity.

Time Complexity: O(n^2 * k) where n is the number of strings and k is the average length of the strings.
Space Complexity: O(n), storing the parent array.

C C++Java Python C#JavaScript

1using System;
2
3class Solution {
4    private int Find(int[] parent, int x) {
5        if (parent[x] != x) parent[x] = Find(parent, parent[x]);
6        return parent[x];
7    }
8
9    private void Union(int[] parent, int x, int y) {
10        int rootX = Find(parent, x);
11        int rootY = Find(parent, y);
12        if (rootX != rootY) parent[rootY] = rootX;
13    }
14
15    private bool AreSimilar(string a, string b) {
16        int diff = 0;
17        for (int i = 0; i < a.Length; i++) {
18            if (a[i] != b[i]) {
19                diff++;
20                if (diff > 2) return false;
21            }
22        }
23        return diff == 2 || diff == 0;
24    }
25
26    public int NumSimilarGroups(string[] strs) {
27        int n = strs.Length;
28        int[] parent = new int[n];
29        for (int i = 0; i < n; i++) {
30            parent[i] = i;
31        }
32
33        for (int i = 0; i < n; i++) {
34            for (int j = i + 1; j < n; j++) {
35                if (AreSimilar(strs[i], strs[j])) {
36                    Union(parent, i, j);
37                }
38            }
39        }
40
41        int count = 0;
42        for (int i = 0; i < n; i++) {
43            if (Find(parent, i) == i) count++;
44        }
45        return count;
46    }
47
48    static void Main() {
49        Solution s = new Solution();
50        string[] strs = {"tars", "rats", "arts", "star"};
51        Console.WriteLine(s.NumSimilarGroups(strs));
52    }
53}
54

Explanation

The C# code uses Union-Find with the path compression technique for optimal performance. The AreSimilar() function ensures that two strings can be in the same group if they allow for only two mismatches, thus allowing swapping to form similarity.

Depth-First Search (DFS) Approach

This approach considers each string as a node and treats detecting similarities like exploring components in a graph. We use a depth-first search (DFS) algorithm to explore nodes. If two strings are similar, we explore all strings similar to this pair within a recursive call.

Time Complexity: O(n^2 * k) where n is the number of strings and k is the length of strings.
Space Complexity: O(n) to maintain the visited array.

C C++Java Python C#JavaScript