Sponsored
Sponsored
In this approach, we treat each string as a node in a graph. Two nodes are connected if the corresponding strings are similar. We use a Union-Find data structure to efficiently group the strings into connected components based on similarity.
Time Complexity: O(n^2 * k) where n is the number of strings and k is the average length of the strings.
Space Complexity: O(n), storing the parent array.
1def find(parent, x):
2 if parent[x] != x:
3 parent[x] = find(parent, parent[x])
4 return parent[x]
5
6
7def union(parent, x, y):
8 rootX = find(parent, x)
9 rootY = find(parent, y)
10 if rootX != rootY:
11 parent[rootY] = rootX
12
13
14def are_similar(a, b):
15 diff = 0
16 for i in range(len(a)):
17 if a[i] != b[i]:
18 diff += 1
19 if diff > 2:
20 return False
21 return diff == 2 or diff == 0
22
23
24def num_similar_groups(strs):
25 n = len(strs)
26 parent = list(range(n))
27
28 for i in range(n):
29 for j in range(i + 1, n):
30 if are_similar(strs[i], strs[j]):
31 union(parent, i, j)
32
33 return len(set(find(parent, i) for i in range(n)))
34
35
36# Example usage:
37strs = ["tars", "rats", "arts", "star"]
38print(num_similar_groups(strs))
39
The Python implementation involves Union-Find operations for similarity checking and component grouping. The are_similar()
function ensures at most two mismatches for a pair of similar strings. Union and path compression make set operations efficient.
This approach considers each string as a node and treats detecting similarities like exploring components in a graph. We use a depth-first search (DFS) algorithm to explore nodes. If two strings are similar, we explore all strings similar to this pair within a recursive call.
Time Complexity: O(n^2 * k) where n is the number of strings and k is the length of strings.
Space Complexity: O(n) to maintain the visited array.
1
This Java DFS-based solution explores all connected components in the similarity graph. Upon visiting an unvisited node, the dfs
function delves into all connected nodes, ensuring complete exploration of the component leading to a count of distinct groups.