Frequencies of Shortest Supersequences - Solution & Explanation

Q: Is Frequencies of Shortest Supersequences easy or hard?

Frequencies of Shortest Supersequences is classified as Hard because it combines multiple advanced ideas: graph modeling, cycle handling, topological sorting, and bitmask enumeration. The challenge lies in realizing that duplicating characters resolves ordering cycles and that enumerating duplication sets is tractable due to the small alphabet size.

Q: Frequencies of Shortest Supersequences Python/Java solution

Implement the algorithm by building the constraint graph, collecting all unique characters, and iterating over bitmasks representing duplicated characters. For each mask, validate the graph ordering using topological sort. Python typically uses lists and deque for Kahn's algorithm, while Java implementations use ArrayList and Queue structures.

Q: What is the best approach for Frequencies of Shortest Supersequences?

The most effective approach models character constraints as a directed graph and enumerates which characters must appear twice using a bitmask. For each mask, validate whether the ordering constraints form a valid DAG using topological sort. Among all valid masks, keep those producing the minimum sequence length. The time complexity is O(2^k * (V + E)), where k is the number of distinct characters.

Q: How to solve Frequencies of Shortest Supersequences in O(2^k * (V+E))?

Build a directed graph from the ordering pairs. Enumerate all subsets of characters using a bitmask where each bit represents whether a character appears twice. For every subset, run topological sort to verify the constraints can be satisfied. Track the minimum resulting length and record the frequency vectors corresponding to valid minimal masks.

Q: Is Frequencies of Shortest Supersequences asked at Google/Amazon/Meta?

Problems combining graph modeling, cycle resolution, and topological sorting frequently appear in interviews at companies like Google, Amazon, and Meta. Variants involving ordering constraints, dependency graphs, and minimal sequence construction are especially common in senior-level algorithm interviews.

Q: What data structure is used in Frequencies of Shortest Supersequences?

The solution primarily uses a directed graph represented with adjacency lists, along with a queue or stack for topological sorting (Kahn's algorithm or DFS). Bitmasks are used to efficiently enumerate subsets of characters that may appear twice. Arrays or maps track character frequencies.

Q: What is the time complexity of Frequencies of Shortest Supersequences?

The optimal solution runs in O(2^k * (V + E)) time. k represents the number of distinct characters that appear in the constraints, while V and E represent the number of vertices and edges in the ordering graph. Space complexity is O(V + E) for storing the graph and auxiliary structures used during topological sorting.

HardArray String Bit Manipulation Graph3 min readAsked at: PhonePe

Practice this problem

Problem Statement

You are given an array of strings words. Find all shortest common supersequences (SCS) of words that are not permutations of each other.

A shortest common supersequence is a string of minimum length that contains each string in words as a subsequence.

Return a 2D array of integers freqs that represent all the SCSs. Each freqs[i] is an array of size 26, representing the frequency of each letter in the lowercase English alphabet for a single SCS. You may return the frequency arrays in any order.

Example 1:

Input: words = ["ab","ba"]

Output: [[1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],[2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]

Explanation:

The two SCSs are "aba" and "bab". The output is the letter frequencies for each one.

Example 2:

Input: words = ["aa","ac"]

Output: [[2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]

Explanation:

The two SCSs are "aac" and "aca". Since they are permutations of each other, keep only "aac".

Example 3:

Input: words = ["aa","bb","cc"]

Output: [[2,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]

Explanation:

"aabbcc" and all its permutations are SCSs.

Constraints:

1 <= words.length <= 256
words[i].length == 2
All strings in words will altogether be composed of no more than 16 unique lowercase letters.
All strings in words are unique.

Approach Overview

Problem Overview: You are given ordering constraints between characters (for example "ab" meaning a must appear before b). The task is to construct the shortest possible supersequence that satisfies every constraint and return the possible frequency distributions of characters among all shortest valid supersequences.

The difficulty comes from cycles in the ordering graph. If a → b and b → a both exist, a single occurrence of each character cannot satisfy both constraints. The only way to resolve the cycle is to duplicate at least one character so it can appear both before and after another character.

Approach 1: Graph Modeling + Bitmask Enumeration + Topological Sort (O(2^k * (V + E)) time, O(V + E) space)

First model the constraints as a directed graph where each character is a node and every pair "ab" creates an edge a → b. Let k be the number of distinct characters involved. Each character must appear at least once, but some characters may need to appear twice to break cycles. Use bitmask enumeration over the k characters to represent which characters are duplicated.

For every mask, assume characters in the mask appear twice while others appear once. Then check whether the constraints can be satisfied without contradiction. This check is done using topological sort on the directed graph. If the ordering constraints among the single-occurrence characters create a cycle that cannot be resolved by the duplicated ones, discard the mask. Otherwise the mask represents a valid supersequence configuration.

Track the total length of the sequence for each mask: length = k + popcount(mask). Maintain the minimum possible length. For masks producing the minimum length, record the resulting frequency vector where each character has frequency 1 or 2 depending on the mask.

The enumeration works because the alphabet size is small. Bitmasking efficiently explores all duplication combinations while the graph validation step ensures constraints are satisfied. This combines techniques from graph algorithms, topological sorting, and bit manipulation.

Recommended for interviews: The bitmask + topological sort strategy is the expected solution. A naive attempt to generate actual supersequences explodes combinatorially. Recognizing that only which characters are duplicated matters reduces the search space dramatically. Enumerating duplication masks shows strong problem decomposition skills, while validating them with topological sorting demonstrates solid graph fundamentals.

Solutions for this problem are being prepared.

Try solving it yourself

Detailed Complexity Analysis

Approach	Time	Space	When to Use
Brute Force Supersequence Generation	Exponential	Exponential	Conceptual baseline only; impractical because the number of candidate sequences grows explosively
Graph Cycle Handling with Character Duplication Enumeration	O(2^k * (V + E))	O(V + E)	General optimal solution when the number of distinct characters is small
Bitmask + Topological Sort Validation	O(2^k * (V + E))	O(V + E)	Interview-preferred implementation using compact mask representation and DAG checks

Video Solution

3435. Frequencies of Shortest Supersequences (Leetcode Hard) • Programming Live with Larry • 633 views views

Watch 1 more video solutions →

Frequently Asked Questions

Is Frequencies of Shortest Supersequences easy or hard?

Frequencies of Shortest Supersequences is classified as Hard because it combines multiple advanced ideas: graph modeling, cycle handling, topological sorting, and bitmask enumeration. The challenge lies in realizing that duplicating characters resolves ordering cycles and that enumerating duplication sets is tractable due to the small alphabet size.

Frequencies of Shortest Supersequences Python/Java solution

Implement the algorithm by building the constraint graph, collecting all unique characters, and iterating over bitmasks representing duplicated characters. For each mask, validate the graph ordering using topological sort. Python typically uses lists and deque for Kahn's algorithm, while Java implementations use ArrayList and Queue structures.

What is the best approach for Frequencies of Shortest Supersequences?

The most effective approach models character constraints as a directed graph and enumerates which characters must appear twice using a bitmask. For each mask, validate whether the ordering constraints form a valid DAG using topological sort. Among all valid masks, keep those producing the minimum sequence length. The time complexity is O(2^k * (V + E)), where k is the number of distinct characters.

How to solve Frequencies of Shortest Supersequences in O(2^k * (V+E))?

Build a directed graph from the ordering pairs. Enumerate all subsets of characters using a bitmask where each bit represents whether a character appears twice. For every subset, run topological sort to verify the constraints can be satisfied. Track the minimum resulting length and record the frequency vectors corresponding to valid minimal masks.

Is Frequencies of Shortest Supersequences asked at Google/Amazon/Meta?

Problems combining graph modeling, cycle resolution, and topological sorting frequently appear in interviews at companies like Google, Amazon, and Meta. Variants involving ordering constraints, dependency graphs, and minimal sequence construction are especially common in senior-level algorithm interviews.

What data structure is used in Frequencies of Shortest Supersequences?

The solution primarily uses a directed graph represented with adjacency lists, along with a queue or stack for topological sorting (Kahn's algorithm or DFS). Bitmasks are used to efficiently enumerate subsets of characters that may appear twice. Arrays or maps track character frequencies.

What is the time complexity of Frequencies of Shortest Supersequences?

The optimal solution runs in O(2^k * (V + E)) time. k represents the number of distinct characters that appear in the constraints, while V and E represent the number of vertices and edges in the ordering graph. Space complexity is O(V + E) for storing the graph and auxiliary structures used during topological sorting.

Ready to solve this problem?

Practice Frequencies of Shortest Supersequences with our built-in code editor and test cases.

Practice on FleetCode

Two Sum

Median of Two Sorted Arrays

Problem Info

DifficultyHard

Acceptance22.2%

Approaches0

Reading time3 min

Asked at

PhonePe

Practice this problem

Open in Editor

Frequencies of Shortest Supersequences - Solution & Explanation

Problem Statement

Approach Overview

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Frequencies of Shortest Supersequences - Solution & Explanation

Problem Statement

Approach Overview

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents