Given a string paragraph and a string array banned, return the most frequent word that is not banned. It is guaranteed there is at least one word that is not banned, and that the answer is unique.
The words in paragraph are case-insensitive and the answer should be returned in lowercase.
Example 1:
Input: paragraph = "Bob hit a ball, the hit BALL flew far after it was hit.", banned = ["hit"]
Output: "ball"
Explanation: "hit" occurs 3 times, but it is a banned word. "ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph. Note that words in the paragraph are not case sensitive, that punctuation is ignored (even if adjacent to words, such as "ball,"), and that "hit" isn't the answer even though it occurs more because it is banned.
Example 2:
Input: paragraph = "a.", banned = []
Output: "a"
Constraints:
1 <= paragraph.length <= 1000
paragraph consists of English letters, space ' ', or one of the symbols: "!?',;.".
0 <= banned.length <= 100
1 <= banned[i].length <= 10
banned[i] consists of only lowercase English letters.

Problem Overview: You receive a paragraph string and a list of banned words. The task is to return the most frequent word that is not banned. Words are case-insensitive and punctuation should be ignored, so parsing and normalization are key parts of the solution.
Approach 1: Frequency Count with HashMap (O(n) time, O(n) space)
This approach treats the paragraph as a stream of characters and builds words while scanning. Convert characters to lowercase and ignore punctuation. Each completed word is checked against a banned set using constant-time lookup. If the word is not banned, update its frequency in a HashMap (or dictionary). While counting, track the word with the highest frequency so you avoid a second pass over the map.
The key insight is that counting frequencies during a single pass over the text avoids repeated scanning. A hash-based structure gives O(1) average lookup for both banned checks and frequency updates. Total complexity is O(n) time where n is the paragraph length, and O(n) space for storing unique words. This approach relies heavily on hash table lookups and simple string processing.
Approach 2: Advanced String Manipulation and Collection (O(n) time, O(n) space)
This variation focuses on preprocessing the paragraph using string utilities. Replace punctuation characters with spaces, convert the text to lowercase, and split the paragraph into tokens. After tokenization, iterate through the resulting list of words and maintain counts in a map while skipping banned entries stored in a set.
The advantage of this approach is cleaner implementation in languages that support powerful string operations and collections. Splitting the text converts the problem into a straightforward array traversal with frequency counting. Each word update and banned lookup still runs in O(1) average time using hash structures. The overall complexity remains O(n) time and O(n) space because every character and token is processed once.
Recommended for interviews: The HashMap frequency counting approach is what most interviewers expect. It demonstrates that you can normalize input, use a banned set for fast filtering, and maintain counts efficiently with a map. Starting with the straightforward counting idea shows problem understanding, while implementing it in a single pass with proper string handling shows practical engineering skill.
This approach involves using a hash map to count the frequency of each word in the paragraph after converting it to lowercase and removing punctuation. Then, the word with the highest count that is not in the banned list is selected as the result.
This C solution uses an array of structures to store word frequencies, since C lacks the built-in map types found in higher-level languages. It tokenizes the paragraph on spaces and punctuation, converts each token to lowercase, filters out banned words, and counts frequencies. The most frequent non-banned word is returned.
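The description above can be sketched as follows. This is a minimal illustration, not the page's original listing: the caps `MAX_WORDS` and `MAX_LEN` are assumed bounds chosen from the problem constraints, and lookups in the entry array are linear scans rather than hashed, a common simplification in C.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1000  /* assumed cap on distinct words */
#define MAX_LEN 32      /* assumed cap on word length */

typedef struct {
    char word[MAX_LEN];
    int count;
} Entry;

/* Returns 1 if `word` appears in the banned list (linear scan). */
static int isBanned(const char *word, char **banned, int bannedSize) {
    for (int i = 0; i < bannedSize; i++)
        if (strcmp(word, banned[i]) == 0)
            return 1;
    return 0;
}

char *mostCommonWord(const char *paragraph, char **banned, int bannedSize) {
    static Entry table[MAX_WORDS];
    int entries = 0, best = -1, bestCount = 0;
    char word[MAX_LEN];
    int len = 0;

    for (const char *p = paragraph; ; p++) {
        if (*p && isalpha((unsigned char)*p)) {
            /* Build the current word in lowercase. */
            if (len < MAX_LEN - 1)
                word[len++] = (char)tolower((unsigned char)*p);
        } else if (len > 0) {
            /* A non-letter (or end of string) terminates the word. */
            word[len] = '\0';
            len = 0;
            if (!isBanned(word, banned, bannedSize)) {
                int i;
                for (i = 0; i < entries; i++)
                    if (strcmp(table[i].word, word) == 0)
                        break;
                if (i == entries) {
                    strcpy(table[entries].word, word);
                    table[entries].count = 0;
                    entries++;
                }
                /* Track the running maximum to avoid a second pass. */
                if (++table[i].count > bestCount) {
                    bestCount = table[i].count;
                    best = i;
                }
            }
        }
        if (*p == '\0') break;
    }
    return table[best].word;
}
```

Because the table is scanned linearly, the worst case here is O(N * U) for U unique words; replacing the scan with a hash table recovers the O(N + M) bound stated below.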
Time Complexity: O(N + M), where N is the length of the paragraph and M is the number of banned words. Space Complexity: O(N) for storing word frequencies.
This approach leverages the string manipulation functions available in each language for efficient parsing and counting. The words are extracted, normalized, and counted using language-specific methods and libraries, yielding cleaner code.
In this C variation, we use qsort to order word entries by frequency after processing the paragraph. We tokenize the paragraph, convert tokens to lowercase, skip words found in the banned list, and store counts in a custom structure array. Sorting then places the most frequent non-banned word first.
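A sketch of that qsort-based variation is below. It is an assumed reconstruction rather than the page's original code: it normalizes the buffer in place (so the caller must pass a writable string), tokenizes with strtok, and the `Entry` type and size caps are illustrative choices.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1000  /* assumed cap on distinct words */
#define MAX_LEN 32      /* assumed cap on word length */

typedef struct {
    char word[MAX_LEN];
    int count;
} Entry;

/* Comparator for qsort: higher counts sort first. */
static int byCountDesc(const void *a, const void *b) {
    return ((const Entry *)b)->count - ((const Entry *)a)->count;
}

/* Note: modifies `paragraph` in place, so pass a writable buffer. */
char *mostCommonWordSorted(char *paragraph, char **banned, int bannedSize) {
    static Entry table[MAX_WORDS];
    int entries = 0;

    /* Lowercase letters; replace punctuation and spaces with spaces. */
    for (char *p = paragraph; *p; p++)
        *p = isalpha((unsigned char)*p) ? (char)tolower((unsigned char)*p) : ' ';

    /* Tokenize on spaces; skip banned words; count the rest. */
    for (char *tok = strtok(paragraph, " "); tok; tok = strtok(NULL, " ")) {
        int skip = 0;
        for (int i = 0; i < bannedSize && !skip; i++)
            if (strcmp(tok, banned[i]) == 0)
                skip = 1;
        if (skip)
            continue;
        int i;
        for (i = 0; i < entries; i++)
            if (strcmp(table[i].word, tok) == 0)
                break;
        if (i == entries) {
            strncpy(table[entries].word, tok, MAX_LEN - 1);
            table[entries].word[MAX_LEN - 1] = '\0';
            table[entries].count = 0;
            entries++;
        }
        table[i].count++;
    }

    /* Sort by frequency; the first entry is the answer. */
    qsort(table, (size_t)entries, sizeof(Entry), byCountDesc);
    return table[0].word;
}
```

The sort is what pushes this variant to O(N log N); it is convenient when you also want the full frequency ranking, not just the top word.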
Time Complexity: O(N log N) due to sorting, where N is total words extracted. Space Complexity: O(N).
| Approach | Complexity |
|---|---|
| Approach 1: Frequency Count with HashMap | Time Complexity: O(N + M), where N is the length of the paragraph and M is the number of banned words. Space Complexity: O(N) for storing word frequencies. |
| Approach 2: Advanced String Manipulation and Collection | Time Complexity: O(N log N) due to sorting, where N is total words extracted. Space Complexity: O(N). |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Frequency Count with HashMap | O(n) | O(n) | General case. Efficient single-pass solution commonly expected in coding interviews. |
| Advanced String Manipulation and Collection | O(n) | O(n) | Useful when the language provides convenient string replace and split utilities. |