Run-length encoding is a string compression method that works by replacing consecutive identical characters (repeated 2 or more times) with the concatenation of the character and the number marking the count of the characters (length of the run). For example, to compress the string "aabccc" we replace "aa" by "a2" and replace "ccc" by "c3". Thus the compressed string becomes "a2bc3".
Notice that in this problem, we are not adding '1' after single characters.
Given a string s and an integer k. You need to delete at most k characters from s such that the run-length encoded version of s has minimum length.
Find the minimum length of the run-length encoded version of s after deleting at most k characters.
Example 1:
Input: s = "aaabcccd", k = 2 Output: 4 Explanation: Compressing s without deleting anything will give us "a3bc3d" of length 6. Deleting any of the characters 'a' or 'c' would at most decrease the length of the compressed string to 5, for instance delete 2 'a' then we will have s = "abcccd" which compressed is abc3d. Therefore, the optimal way is to delete 'b' and 'd', then the compressed version of s will be "a3c3" of length 4.
Example 2:
Input: s = "aabbaa", k = 2 Output: 2 Explanation: If we delete both 'b' characters, the resulting compressed string would be "a4" of length 2.
Example 3:
Input: s = "aaaaaaaaaaa", k = 0 Output: 3 Explanation: Since k is zero, we cannot delete anything. The compressed string is "a11" of length 3.
Constraints:
1 <= s.length <= 1000 <= k <= s.lengths contains only lowercase English letters.Problem Overview: Given a string s and an integer k, delete at most k characters so the run-length encoded (RLE) version of the remaining string is as short as possible. The challenge is that removing characters can merge groups or reduce digit counts in the compression length.
Approach 1: Dynamic Programming with Subproblems (O(n2 * k) time, O(n * k) space)
This problem fits naturally into dynamic programming. Define dp[i][k] as the minimum compressed length achievable starting at index i with k deletions remaining. From position i, iterate forward and try forming a group where s[i] is the kept character. Track the frequency of s[i] and count how many other characters must be deleted to maintain a continuous run. As the run grows, the encoded length increases only when the count crosses thresholds (1→2, 9→10, 99→100). For each extension, recursively compute the optimal result for the remainder of the string. Memoization avoids recomputing overlapping states. This approach systematically evaluates all valid compression segments while respecting the deletion budget.
Approach 2: Greedy Algorithm with Counting (O(n2) time, O(1) space)
A more heuristic approach scans segments and greedily keeps the most frequent character within a window while deleting others to form a longer run. Maintain a frequency counter while expanding a window from index i. The idea is to keep the character that appears most often and treat other characters as deletions. This reduces fragmentation in the compressed output because longer runs produce shorter RLE encodings. While expanding, compute how many deletions are required and stop when the budget is exceeded. The encoded length is derived from the run length using the same digit thresholds used in string compression. This strategy is simpler to implement but does not always guarantee optimality for every configuration.
Recommended for interviews: The dynamic programming solution is what interviewers expect. It demonstrates strong reasoning about compression boundaries, deletion tradeoffs, and optimal substructure. The greedy counting idea can help you reason about why grouping characters reduces encoded size, but the DP formulation is the reliable optimal solution for this hard problem.
This approach uses dynamic programming to minimize the length of the compressed string after removing up to k characters. We'll iteratively decide for each position in the string whether to remove the character or keep it, and if kept, determine the optimal compression length.
The idea is to use a DP table where dp[i][j] stores the minimal compression length achievable for the substring s[0...i] with j deletions.
The solution defines a helper function compressLength which calculates the length of the compressed representation given a count. The main function minLength initializes a DP array and iteratively fills it by considering each character, counting contiguous appearances, and updating the DP state to find the minimum compression lengths achievable with allowed deletions.
Time Complexity: O(n^3). For each starting point, we may need to process every subsequent character and make decisions up to k deletions.
Space Complexity: O(n*k). Space used by the DP table.
In this greedy approach, we identify segments of identical characters and evaluate them individually, considering their entire group in the adjustments. By counting the length of these segments, we can make informed decisions about deletions that do not depend on the widely traversed DP table, reducing some overhead complexity.
This C solution uses a frequency table to count occurrences of each character and adjusts the counts after sorting them. By decreasing the most frequent occurrences, we fit within the limit of deletions. The resulting array gives us a simple method to calculate the resulting minimal compressed length.
Time Complexity: O(n + m log m) where n is the length of the string and m = 26 (constant time, simplified in this case).
Space Complexity: O(m) for frequency arrays.
Java
| Approach | Complexity |
|---|---|
| Approach 1: Dynamic Programming with Subproblems | Time Complexity: O(n^3). For each starting point, we may need to process every subsequent character and make decisions up to k deletions. Space Complexity: O(n*k). Space used by the DP table. |
| Approach 2: Greedy Algorithm with Counting | Time Complexity: O(n + m log m) where n is the length of the string and m = 26 (constant time, simplified in this case). Space Complexity: O(m) for frequency arrays. |
| Default Approach | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Dynamic Programming with Subproblems | O(n^2 * k) | O(n * k) | General case and interview setting where optimal compression length is required |
| Greedy Algorithm with Counting | O(n^2) | O(1) | Quick heuristic reasoning about runs and deletions when exploring compression strategies |
String Compression II - Leetcode 1531 - Python • NeetCodeIO • 34,060 views views
Watch 9 more video solutions →Practice String Compression II with our built-in code editor and test cases.
Practice on FleetCode