Watch 10 video solutions for Minimum Length of Anagram Concatenation, a medium level problem involving Hash Table, String, Counting. This walkthrough by Aryan Mittal has 4,681 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
You are given a string s, which is known to be a concatenation of anagrams of some string t.
Return the minimum possible length of the string t.
An anagram is formed by rearranging the letters of a string. For example, "aab", "aba", and, "baa" are anagrams of "aab".
Example 1:
Input: s = "abba"
Output: 2
Explanation:
One possible string t could be "ba".
Example 2:
Input: s = "cdef"
Output: 4
Explanation:
One possible string t could be "cdef", notice that t can be equal to s.
Constraints:
1 <= s.length <= 105s consist only of lowercase English letters.Problem Overview: You are given a string s. The string is formed by concatenating multiple substrings that are all anagrams of the same base string. Your task is to compute the minimum possible length of that base string.
The key observation: if a base string t of length k exists, then s must be split into n / k segments. Every segment must contain the same character frequency as t, just in a different order.
Approach 1: Frequency Analysis and GCD Method (O(n) time, O(1) space)
This approach relies on counting how many times each character appears in the entire string. Suppose character c appears count[c] times. If the string is composed of m anagram blocks, then the base string must contain count[c] / m occurrences of that character. For this to work, m must divide every character count.
Compute the greatest common divisor (GCD) of all character frequencies. That GCD represents the maximum number of identical anagram blocks the string can be split into. The minimum base string length becomes the sum of count[c] / gcd for all characters. Implementation uses a simple frequency array or hash map from Hash Table counting combined with a GCD calculation.
This works because each block must share identical frequency proportions. By normalizing counts with the GCD, you derive the smallest possible frequency profile that can repeat to generate the entire string.
Approach 2: Divisibility and Modular Arithmetic Analysis (O(n * d) time, O(1) space)
This method explicitly tests candidate base lengths. Let n be the length of the string. Any valid base length must divide n. For each divisor k, treat the first k characters as the frequency template. Then iterate through the string in blocks of size k, counting characters and checking if each block is an anagram of the template.
Character frequency comparisons can be implemented with arrays or counters from Counting. If all segments match, k is a valid answer. The smallest valid k is the minimum base length.
This approach is more explicit and easier to reason about during debugging. However, it performs repeated scans of the string and therefore runs slower than the GCD trick in worst cases.
Recommended for interviews: The GCD-based frequency analysis is the optimal approach and runs in O(n). Interviewers often expect you to recognize that anagram blocks imply proportional frequency counts across the entire string. Showing the divisor-check method first demonstrates clear reasoning, but the GCD insight shows stronger algorithmic intuition.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Frequency Analysis and GCD Method | O(n) | O(1) | Best general solution. Uses global frequency counts to compute the minimal base pattern directly. |
| Divisibility and Modular Arithmetic Analysis | O(n * d) | O(1) | Useful for reasoning or debugging. Explicitly checks each valid segment length divisor. |