Given an array of strings words and an integer k, return the k most frequent strings.
Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.
Example 1:
Input: words = ["i","love","leetcode","i","love","coding"], k = 2
Output: ["i","love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4
Output: ["the","is","sunny","day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrences being 4, 3, 2 and 1 respectively.
Constraints:
1 <= words.length <= 500
1 <= words[i].length <= 10
words[i] consists of lowercase English letters.
k is in the range [1, the number of unique words[i]].
Follow-up: Could you solve it in O(n log(k)) time and O(n) extra space?
Problem Overview: Given an array of words, return the k most frequent words. Words with higher frequency come first, and when two words have the same frequency, the lexicographically smaller word appears earlier. The challenge is efficiently counting frequencies and ordering results by both frequency and alphabetical order.
Approach 1: Hash Map and Sorting (Time: O(n log n), Space: O(n))
The straightforward solution uses a frequency map. First iterate through the input array and count occurrences using a hash table. This step runs in O(n). Next convert the map entries into a list and sort them using a custom comparator: higher frequency first, and if frequencies match, compare words lexicographically.
The sorting step dominates the runtime at O(m log m), where m is the number of unique words (worst case m = n). The comparator ensures correct ordering without additional passes. After sorting, simply take the first k words. This approach is easy to implement and works well when the number of unique words is moderate.
Because the heavy lifting is done by sorting, this method is reliable and readable. Many interview candidates start with this solution because it clearly separates counting and ordering logic.
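A minimal Python sketch of the counting-then-sorting approach might look like this, using `collections.Counter` for the frequency map and a `(-count, word)` sort key to get both orderings in one pass:

```python
from collections import Counter

def top_k_frequent(words, k):
    # Step 1: count occurrences of each word in O(n).
    counts = Counter(words)
    # Step 2: sort unique words by frequency descending,
    # breaking ties lexicographically (ascending).
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return ordered[:k]
```

Negating the count lets a single ascending sort handle both criteria: larger counts become smaller keys, and equal counts fall through to the natural string comparison.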
Approach 2: Min-Heap (Time: O(n log k), Space: O(n))
When k is much smaller than the number of unique words, a heap reduces unnecessary sorting. First build the same frequency map using a hash table in O(n). Then iterate through each unique word and push it into a size-limited min-heap based on frequency.
The heap stores at most k elements. The comparator prioritizes smaller frequency, and when frequencies tie, the lexicographically larger word is considered smaller in the heap so it gets removed first. Whenever the heap size exceeds k, pop the smallest element. This keeps only the top k candidates.
Each heap insertion or removal costs O(log k), so processing all unique words takes O(m log k). After processing, extract elements from the heap and reverse them to produce the final order. This approach relies on a priority queue to avoid sorting the entire dataset.
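One way to sketch the size-k min-heap in Python: since `heapq` does not take a custom comparator, a small wrapper class (here called `Entry`, a name chosen for illustration) can define the heap ordering so that lower frequency is "smaller" and, on ties, the lexicographically larger word is "smaller" and gets evicted first:

```python
import heapq
from collections import Counter

class Entry:
    """Heap ordering: lower count pops first; on equal counts,
    the lexicographically LARGER word pops first."""
    def __init__(self, count, word):
        self.count, self.word = count, word

    def __lt__(self, other):
        if self.count != other.count:
            return self.count < other.count
        return self.word > other.word

def top_k_frequent(words, k):
    counts = Counter(words)          # O(n) frequency map
    heap = []
    for word, count in counts.items():
        heapq.heappush(heap, Entry(count, word))
        if len(heap) > k:
            heapq.heappop(heap)      # evict the weakest candidate
    # The heap now holds the top k; pop weakest-first and reverse.
    result = []
    while heap:
        result.append(heapq.heappop(heap).word)
    return result[::-1]
```

Each push/pop touches a heap of at most k+1 elements, so the loop over m unique words costs O(m log k), matching the follow-up bound.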
Recommended for interviews: Start with the hash map + sorting approach because it demonstrates clear thinking and correct ordering logic. Then mention the min-heap optimization. Interviewers often expect the heap solution when k is small relative to the dataset, since it reduces the ordering cost from O(n log n) to O(n log k) while still using a hash map for counting.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Hash Map + Sorting | O(n log n) | O(n) | General solution when simplicity and readability matter more than strict optimization |
| Min-Heap (Priority Queue) | O(n log k) | O(n) | Best when k is much smaller than the number of unique words |