Watch the video solution for Better Compression of String, a medium level problem involving Hash Table, String, Sorting. This walkthrough by Programming Live with Larry has 564 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
You are given a string compressed representing a compressed version of a string. The format is a character followed by its frequency. For example, "a3b1a1c2" is a compressed version of the string "aaabacc".
We seek a better compression with the following conditions:
Return the better compression of compressed.
Note: In the better version of compression, the order of letters may change, which is acceptable.
Example 1:
Input: compressed = "a3c9b2c1"
Output: "a3b2c10"
Explanation:
Characters "a" and "b" appear only once in the input, but "c" appears twice, once with a size of 9 and once with a size of 1.
Hence, in the resulting string, it should have a size of 10.
Example 2:
Input: compressed = "c2b3a1"
Output: "a1b3c2"
Example 3:
Input: compressed = "a2b4c1"
Output: "a2b4c1"
Constraints:
1 <= compressed.length <= 6 * 104compressed consists only of lowercase English letters and digits.compressed is a valid compression, i.e., each character is followed by its frequency.[1, 104] and have no leading zeroes.Problem Overview: You receive a compressed string such as a12b3a2, where each character is followed by a positive integer representing its frequency. Characters may appear multiple times and are not guaranteed to be in sorted order. The task is to combine the counts for identical characters and return a new compressed string with characters sorted alphabetically.
Approach 1: Hash Table + Parsing + Sorting (O(n + k log k) time, O(k) space)
Scan the string and extract each character with its numeric frequency. Use two pointers: one pointer reads the character and the second pointer advances through consecutive digits to build the full number. Store the accumulated frequency in a hash table where the key is the character and the value is the total count. After processing the entire string, sort the unique characters alphabetically and rebuild the compressed string using the aggregated counts. This approach is simple and flexible when the alphabet size is not fixed.
Approach 2: Counting Array + Two Pointers (O(n) time, O(1) space)
Since the problem deals with lowercase English letters, you can replace the hash table with a fixed array of size 26. Traverse the string using the same two‑pointer parsing strategy: read a character, then iterate through digits to construct its frequency. Convert the character to an index (c - 'a') and accumulate the count in the array. After parsing the entire string, iterate from a to z and append each character with its final count if the value is non‑zero. Because the alphabet size is constant, the result is naturally sorted and avoids an explicit sorting step.
The key implementation detail is parsing multi‑digit counts correctly. When you encounter a digit, repeatedly multiply the current number by 10 and add the next digit until a non‑digit character appears. This technique is common in string parsing problems and ensures counts like a123 are interpreted properly.
Recommended for interviews: The counting array approach is usually preferred. It runs in linear time O(n) with constant space and avoids sorting entirely. Starting with the hash table solution still demonstrates solid problem decomposition and understanding of frequency aggregation, but moving to the fixed-size counting structure shows awareness of constraints and optimization.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Hash Table + Parsing + Sorting | O(n + k log k) | O(k) | General case when character set is not limited or when using a flexible frequency map |
| Counting Array + Two Pointers | O(n) | O(1) | Best when characters are limited (e.g., lowercase a–z) and sorted output is required |