Watch 3 video solutions for Groups of Strings, a hard level problem involving String, Bit Manipulation, Union Find. This walkthrough by Coding Decoded has 1,581 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
You are given a 0-indexed array of strings words. Each string consists of lowercase English letters only. No letter occurs more than once in any string of words.
Two strings s1 and s2 are said to be connected if the set of letters of s2 can be obtained from the set of letters of s1 by any one of the following operations:
s1.s1.s1 with any letter, including itself.The array words can be divided into one or more non-intersecting groups. A string belongs to a group if any one of the following is true:
Note that the strings in words should be grouped in such a manner that a string belonging to a group cannot be connected to a string present in any other group. It can be proved that such an arrangement is always unique.
Return an array ans of size 2 where:
ans[0] is the maximum number of groups words can be divided into, andans[1] is the size of the largest group.
Example 1:
Input: words = ["a","b","ab","cde"] Output: [2,3] Explanation: - words[0] can be used to obtain words[1] (by replacing 'a' with 'b'), and words[2] (by adding 'b'). So words[0] is connected to words[1] and words[2]. - words[1] can be used to obtain words[0] (by replacing 'b' with 'a'), and words[2] (by adding 'a'). So words[1] is connected to words[0] and words[2]. - words[2] can be used to obtain words[0] (by deleting 'b'), and words[1] (by deleting 'a'). So words[2] is connected to words[0] and words[1]. - words[3] is not connected to any string in words. Thus, words can be divided into 2 groups ["a","b","ab"] and ["cde"]. The size of the largest group is 3.
Example 2:
Input: words = ["a","ab","abc"] Output: [1,3] Explanation: - words[0] is connected to words[1]. - words[1] is connected to words[0] and words[2]. - words[2] is connected to words[1]. Since all strings are connected to each other, they should be grouped together. Thus, the size of the largest group is 3.
Constraints:
1 <= words.length <= 2 * 1041 <= words[i].length <= 26words[i] consists of lowercase English letters only.words[i].Problem Overview: You are given an array of lowercase strings. Two strings belong to the same group if you can transform one into the other by adding, removing, or replacing exactly one character. The task is to compute the number of groups and the size of the largest group.
Approach 1: Union-Find with Bitmask Representation (O(n * 26) time, O(n) space)
Each string contains unique characters, which means the set of characters can be encoded as a 26-bit integer. Bit i indicates whether character 'a' + i exists in the string. Using this representation, adding or removing a character becomes a single bit toggle operation. You maintain a hash map from bitmask to the index of a string and use a Union-Find structure to merge connected strings.
For every string mask, generate neighbors by toggling each of the 26 bits to simulate adding or removing a character. If the resulting mask exists in the map, union the two indices. Replacement is handled by removing one bit and adding another; this can be detected by checking intermediate masks generated during deletion. Since each string generates at most 26 variations, the algorithm runs in roughly O(n * 26) operations with near-constant Union-Find merges.
This approach is highly efficient because bit operations are constant time and the disjoint-set structure quickly merges connected components. It works especially well when the alphabet size is small and fixed. The technique combines bit manipulation with connectivity tracking.
Approach 2: Graph Representation and BFS/DFS (O(n * 26) average, higher constant factors, O(n) space)
Another way to view the problem is as a graph where each string is a node and edges connect strings that differ by one allowed operation. Convert every string into a bitmask and store them in a hash set for quick lookup. For each node, generate all possible masks that differ by adding, removing, or replacing one character. If a generated mask exists, connect the nodes.
Once the implicit graph is built, run BFS or DFS from each unvisited node to discover a connected component. Track the component size while traversing neighbors generated through bit operations. Although the theoretical complexity is similar, repeatedly exploring neighbors during traversal leads to larger constant overhead compared with the Union-Find method.
This method is conceptually simpler because it treats the problem as a standard connected-components search on a graph derived from string transformations.
Recommended for interviews: The Union-Find with bitmask approach is what most interviewers expect for this problem. It demonstrates understanding of bit-level encoding, efficient neighbor generation, and dynamic connectivity. A brute-force pairwise comparison shows the transformation rule, but scaling it to large input requires the optimized bitmask + Union-Find design.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Union-Find with Bitmask Representation | O(n * 26) | O(n) | Best general solution. Efficient grouping using bit operations and disjoint sets. |
| Graph Representation with BFS/DFS | O(n * 26) average | O(n) | Useful when modeling the problem as connected components for conceptual clarity. |