Watch 10 video solutions for Extra Characters in a String, a medium level problem involving Array, Hash Table, String. This walkthrough by NeetCodeIO has 23,906 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
You are given a 0-indexed string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings.
Return the minimum number of extra characters left over if you break up s optimally.
Example 1:
Input: s = "leetscode", dictionary = ["leet","code","leetcode"] Output: 1 Explanation: We can break s in two substrings: "leet" from index 0 to 3 and "code" from index 5 to 8. There is only 1 unused character (at index 4), so we return 1.
Example 2:
Input: s = "sayhelloworld", dictionary = ["hello","world"] Output: 3 Explanation: We can break s in two substrings: "hello" from index 3 to 7 and "world" from index 8 to 12. The characters at indices 0, 1, 2 are not used in any substring and thus are considered as extra characters. Hence, we return 3.
Constraints:
1 <= s.length <= 501 <= dictionary.length <= 501 <= dictionary[i].length <= 50dictionary[i] and s consists of only lowercase English lettersdictionary contains distinct wordsProblem Overview: You are given a string s and a dictionary of valid words. The goal is to split the string into dictionary words so that the number of leftover characters (characters not part of any word) is minimized.
Approach 1: Dynamic Programming (O(n²) time, O(n) space)
This approach treats the problem as a prefix optimization task using dynamic programming. Create a DP array where dp[i] represents the minimum number of extra characters in the substring starting at index i. Iterate from the end of the string toward the beginning. At each position, assume the current character is extra (1 + dp[i+1]), then check every substring s[i:j]. If the substring exists in the dictionary (stored in a hash table for O(1) lookups), update dp[i] with dp[j]. This effectively tries all valid word breaks while minimizing leftover characters. Time complexity is O(n²) due to checking substrings, and space complexity is O(n) for the DP array.
Approach 2: Trie and Memoization (O(n * L) average time, O(n + totalDictChars) space)
This version improves substring checking using a Trie. Insert all dictionary words into the Trie, then traverse the string while simultaneously walking the Trie. From index i, explore characters forward as long as they match a Trie path. Every time you hit a word-ending node, recursively compute the result for the remaining suffix. Memoization caches results for each index so the same suffix is never recomputed. The key advantage is that substring checks become incremental character traversals instead of repeated string slicing. The time complexity is roughly O(n * L), where L is the maximum dictionary word length, while space complexity includes the memo array plus Trie storage.
Recommended for interviews: The dynamic programming solution is typically expected because it clearly demonstrates optimal substructure and efficient state transitions. It is straightforward to implement and easy to reason about during an interview. The Trie + memoization approach shows deeper knowledge of string optimization and can reduce redundant substring checks when the dictionary contains many overlapping prefixes.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Dynamic Programming with Hash Set | O(n²) | O(n) | General case. Simple implementation and common interview expectation. |
| Trie with Memoization | O(n * L) | O(n + dictionary size) | When dictionary has many overlapping prefixes and repeated substring checks become expensive. |