Watch 10 video solutions for String Compression II, a hard level problem involving String, Dynamic Programming. This walkthrough by NeetCodeIO has 34,060 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Run-length encoding is a string compression method that works by replacing consecutive identical characters (repeated 2 or more times) with the concatenation of the character and the number marking the count of the characters (length of the run). For example, to compress the string "aabccc" we replace "aa" by "a2" and replace "ccc" by "c3". Thus the compressed string becomes "a2bc3".
Notice that in this problem, we are not adding '1' after single characters.
Given a string s and an integer k. You need to delete at most k characters from s such that the run-length encoded version of s has minimum length.
Find the minimum length of the run-length encoded version of s after deleting at most k characters.
Example 1:
Input: s = "aaabcccd", k = 2 Output: 4 Explanation: Compressing s without deleting anything will give us "a3bc3d" of length 6. Deleting any of the characters 'a' or 'c' would at most decrease the length of the compressed string to 5, for instance delete 2 'a' then we will have s = "abcccd" which compressed is abc3d. Therefore, the optimal way is to delete 'b' and 'd', then the compressed version of s will be "a3c3" of length 4.
Example 2:
Input: s = "aabbaa", k = 2 Output: 2 Explanation: If we delete both 'b' characters, the resulting compressed string would be "a4" of length 2.
Example 3:
Input: s = "aaaaaaaaaaa", k = 0 Output: 3 Explanation: Since k is zero, we cannot delete anything. The compressed string is "a11" of length 3.
Constraints:
1 <= s.length <= 1000 <= k <= s.lengths contains only lowercase English letters.Problem Overview: Given a string s and an integer k, delete at most k characters so the run-length encoded (RLE) version of the remaining string is as short as possible. The challenge is that removing characters can merge groups or reduce digit counts in the compression length.
Approach 1: Dynamic Programming with Subproblems (O(n2 * k) time, O(n * k) space)
This problem fits naturally into dynamic programming. Define dp[i][k] as the minimum compressed length achievable starting at index i with k deletions remaining. From position i, iterate forward and try forming a group where s[i] is the kept character. Track the frequency of s[i] and count how many other characters must be deleted to maintain a continuous run. As the run grows, the encoded length increases only when the count crosses thresholds (1→2, 9→10, 99→100). For each extension, recursively compute the optimal result for the remainder of the string. Memoization avoids recomputing overlapping states. This approach systematically evaluates all valid compression segments while respecting the deletion budget.
Approach 2: Greedy Algorithm with Counting (O(n2) time, O(1) space)
A more heuristic approach scans segments and greedily keeps the most frequent character within a window while deleting others to form a longer run. Maintain a frequency counter while expanding a window from index i. The idea is to keep the character that appears most often and treat other characters as deletions. This reduces fragmentation in the compressed output because longer runs produce shorter RLE encodings. While expanding, compute how many deletions are required and stop when the budget is exceeded. The encoded length is derived from the run length using the same digit thresholds used in string compression. This strategy is simpler to implement but does not always guarantee optimality for every configuration.
Recommended for interviews: The dynamic programming solution is what interviewers expect. It demonstrates strong reasoning about compression boundaries, deletion tradeoffs, and optimal substructure. The greedy counting idea can help you reason about why grouping characters reduces encoded size, but the DP formulation is the reliable optimal solution for this hard problem.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Dynamic Programming with Subproblems | O(n^2 * k) | O(n * k) | General case and interview setting where optimal compression length is required |
| Greedy Algorithm with Counting | O(n^2) | O(1) | Quick heuristic reasoning about runs and deletions when exploring compression strategies |