You are given an array of strings words. Find all shortest common supersequences (SCS) of words that are not permutations of each other.
A shortest common supersequence is a string of minimum length that contains each string in words as a subsequence.
Return a 2D array of integers freqs that represent all the SCSs. Each freqs[i] is an array of size 26, representing the frequency of each letter in the lowercase English alphabet for a single SCS. You may return the frequency arrays in any order.
Example 1:
Input: words = ["ab","ba"]
Output: [[1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]
Explanation:
The two SCSs are "aba" and "bab". The output is the letter frequencies for each one.
Example 2:
Input: words = ["aa","ac"]
Output: [[2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]
Explanation:
The two SCSs are "aac" and "aca". Since they are permutations of each other, keep only "aac".
Example 3:
Input: words = ["aa","bb","cc"]
Output: [[2,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]
Explanation:
"aabbcc" and all its permutations are SCSs.
Constraints:
1 <= words.length <= 256words[i].length == 2words will altogether be composed of no more than 16 unique lowercase letters.words are unique.Problem Overview: You are given ordering constraints between characters (for example "ab" meaning a must appear before b). The task is to construct the shortest possible supersequence that satisfies every constraint and return the possible frequency distributions of characters among all shortest valid supersequences.
The difficulty comes from cycles in the ordering graph. If a → b and b → a both exist, a single occurrence of each character cannot satisfy both constraints. The only way to resolve the cycle is to duplicate at least one character so it can appear both before and after another character.
Approach 1: Graph Modeling + Bitmask Enumeration + Topological Sort (O(2^k * (V + E)) time, O(V + E) space)
First model the constraints as a directed graph where each character is a node and every pair "ab" creates an edge a → b. Let k be the number of distinct characters involved. Each character must appear at least once, but some characters may need to appear twice to break cycles. Use bitmask enumeration over the k characters to represent which characters are duplicated.
For every mask, assume characters in the mask appear twice while others appear once. Then check whether the constraints can be satisfied without contradiction. This check is done using topological sort on the directed graph. If the ordering constraints among the single-occurrence characters create a cycle that cannot be resolved by the duplicated ones, discard the mask. Otherwise the mask represents a valid supersequence configuration.
Track the total length of the sequence for each mask: length = k + popcount(mask). Maintain the minimum possible length. For masks producing the minimum length, record the resulting frequency vector where each character has frequency 1 or 2 depending on the mask.
The enumeration works because the alphabet size is small. Bitmasking efficiently explores all duplication combinations while the graph validation step ensures constraints are satisfied. This combines techniques from graph algorithms, topological sorting, and bit manipulation.
Recommended for interviews: The bitmask + topological sort strategy is the expected solution. A naive attempt to generate actual supersequences explodes combinatorially. Recognizing that only which characters are duplicated matters reduces the search space dramatically. Enumerating duplication masks shows strong problem decomposition skills, while validating them with topological sorting demonstrates solid graph fundamentals.
Solutions for this problem are being prepared.
Try solving it yourself| Approach | Time | Space | When to Use |
|---|---|---|---|
| Brute Force Supersequence Generation | Exponential | Exponential | Conceptual baseline only; impractical because the number of candidate sequences grows explosively |
| Graph Cycle Handling with Character Duplication Enumeration | O(2^k * (V + E)) | O(V + E) | General optimal solution when the number of distinct characters is small |
| Bitmask + Topological Sort Validation | O(2^k * (V + E)) | O(V + E) | Interview-preferred implementation using compact mask representation and DAG checks |
3435. Frequencies of Shortest Supersequences (Leetcode Hard) • Programming Live with Larry • 633 views views
Watch 1 more video solutions →Practice Frequencies of Shortest Supersequences with our built-in code editor and test cases.
Practice on FleetCode