Given an array of strings words and an integer k, return the k most frequent strings.
Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.
Example 1:
Input: words = ["i","love","leetcode","i","love","coding"], k = 2
Output: ["i","love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4
Output: ["the","is","sunny","day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with occurrence counts of 4, 3, 2 and 1 respectively.
Constraints:
- 1 <= words.length <= 500
- 1 <= words[i].length <= 10
- words[i] consists of lowercase English letters.
- k is in the range [1, the number of unique words[i]].

Follow-up: Could you solve it in O(n log(k)) time and O(n) extra space?
To solve #692 Top K Frequent Words, the key idea is to first count how often each word appears and then efficiently retrieve the k most frequent ones. A common strategy is to use a hash table to store word frequencies, followed by a priority queue (heap) that orders words by frequency and lexicographical order when frequencies tie.
A min-heap of size k is often preferred because it keeps only the top candidates while iterating through the frequency map. The heap comparator typically prioritizes lower frequency first, and when frequencies match, the word with higher lexicographical order is removed earlier. After processing all words, the heap contains the desired results.
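The size-k heap idea can be sketched compactly in Python with `heapq.nsmallest`, which maintains an internal heap of at most k entries. The function name here is illustrative; the `(-count[w], w)` key makes "smallest" mean "most frequent, then alphabetically first":

```python
import heapq
from collections import Counter

def top_k_frequent_heap(words, k):
    count = Counter(words)
    # nsmallest keeps a heap of size k internally, matching the
    # O(n log k) bound: higher counts sort first via the negated
    # frequency, and ties fall back to alphabetical order.
    return heapq.nsmallest(k, count, key=lambda w: (-count[w], w))
```

For example, `top_k_frequent_heap(["i","love","leetcode","i","love","coding"], 2)` returns `["i", "love"]`.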
An alternative is bucket sort, where words are grouped by frequency and then sorted lexicographically within each bucket. This avoids per-element heap operations but requires sorting inside each bucket. The heap-based method runs in O(n log k) time with O(n) space.
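The bucket-sort variant can be sketched as follows (the function name is illustrative): since no word can appear more than `len(words)` times, buckets indexed by frequency can be walked from highest to lowest.

```python
from collections import Counter, defaultdict

def top_k_frequent_bucket(words, k):
    # Count frequencies, then group words by their count.
    count = Counter(words)
    buckets = defaultdict(list)
    for word, freq in count.items():
        buckets[freq].append(word)
    # Walk frequencies from highest to lowest; sort ties alphabetically.
    result = []
    for freq in range(len(words), 0, -1):
        for word in sorted(buckets.get(freq, [])):
            result.append(word)
            if len(result) == k:
                return result
    return result
```

The per-bucket `sorted` call is what makes the worst case O(n log n) rather than linear, as noted in the table below.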
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Hash Map + Min Heap | O(n log k) | O(n) |
| Bucket Sort + Lexicographical Sort | O(n log n) | O(n) |
Explanation:
To solve this problem, we can use a hash map (or dictionary) to count the frequency of each word in the given array. Once we have the frequencies, we can then create a list of words and sort them based on their frequency in descending order. An additional condition for sorting is that if two words have the same frequency, we sort them alphabetically.
This approach simplifies the solution since sorting (bound by the number of unique words) helps easily extract the top k frequent words.
Time Complexity: O(n log n), where n is the number of unique words. This arises from the sorting operation.
Space Complexity: O(n), used for storing unique words.
```python
from collections import Counter

def topKFrequent(words, k):
    count = Counter(words)
    candidates = list(count.keys())
    # Sort by descending frequency; ties break by ascending alphabetical order.
    candidates.sort(key=lambda w: (-count[w], w))
    return candidates[:k]
```

Here, Counter from the collections module is utilized to simplify frequency counting. We sort the words using a lambda function that prioritizes frequency and then lexicographical order. Finally, the top k words are selected as the result.
Explanation:
This method takes advantage of a min-heap to efficiently determine the top k frequent words. Instead of sorting the entire set of words, which incurs an O(n log n) cost, we can maintain a min-heap of size k. As we iterate through the frequency map, we push elements into the heap. If heap size exceeds k, we remove the smallest element (i.e., least frequent word or alphabetically greater if frequencies are tied).
This is guided by the Follow-up question, showcasing an alternative solution bounded by O(n log k) time complexity.
Time Complexity: O(n log k), optimizing by constraining the heap's growth beyond k elements.
Space Complexity: O(n + k), combining map and heap usage.
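The eviction scheme described above can also be sketched in Python before the C++ version below. Since `heapq` has no comparator parameter, a small wrapper class (names here are illustrative) inverts the word comparison so that, among equal frequencies, the lexicographically larger word sits at the top of the min-heap and is evicted first:

```python
import heapq
from collections import Counter

class Entry:
    """Heap entry ordered so that low-frequency words, and among ties the
    lexicographically larger word, are popped first."""
    def __init__(self, freq, word):
        self.freq, self.word = freq, word

    def __lt__(self, other):
        if self.freq == other.freq:
            return self.word > other.word  # larger word counts as "smaller"
        return self.freq < other.freq

def top_k_with_eviction(words, k):
    count = Counter(words)
    heap = []
    for word, freq in count.items():
        heapq.heappush(heap, Entry(freq, word))
        if len(heap) > k:
            heapq.heappop(heap)  # evict the weakest remaining candidate
    # Entries come off the heap weakest-first, so reverse at the end.
    result = []
    while heap:
        result.append(heapq.heappop(heap).word)
    return result[::-1]
```

Each of the n unique words triggers at most one push and one pop on a heap of size at most k + 1, giving the O(n log k) bound.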
```cpp
#include <vector>
#include <string>
#include <unordered_map>
#include <queue>
#include <algorithm>  // for std::reverse

class Solution {
public:
    // Min-heap ordering: lowest frequency on top; among equal frequencies,
    // the lexicographically larger word is on top so it is evicted first.
    struct Comp {
        bool operator()(const std::pair<int, std::string>& a, const std::pair<int, std::string>& b) const {
            if (a.first == b.first)
                return a.second < b.second;
            return a.first > b.first;
        }
    };

    std::vector<std::string> topKFrequent(std::vector<std::string>& words, int k) {
        std::unordered_map<std::string, int> count;
        for (const std::string& word : words) {
            count[word]++;
        }
        std::priority_queue<std::pair<int, std::string>, std::vector<std::pair<int, std::string>>, Comp> minHeap;
        for (const auto& [word, freq] : count) {
            minHeap.emplace(freq, word);
            if (minHeap.size() > static_cast<size_t>(k)) {
                minHeap.pop();
            }
        }
        // Words come off the heap least-frequent first; reverse for the final order.
        std::vector<std::string> result;
        while (!minHeap.empty()) {
            result.push_back(minHeap.top().second);
            minHeap.pop();
        }
        std::reverse(result.begin(), result.end());
        return result;
    }
};
```
Yes, variations of this problem are commonly asked in FAANG and other top tech interviews. It tests knowledge of hash maps, heaps, sorting logic, and handling custom comparators—skills frequently required in coding interviews.
The optimal approach typically uses a hash map to count word frequencies and a min-heap of size k to keep track of the top k frequent words. The heap maintains ordering by frequency and lexicographical order for ties. This results in an efficient O(n log k) time complexity.
A combination of a hash table and a priority queue (min-heap) works best. The hash table stores frequency counts, while the heap efficiently maintains the top k elements based on frequency and alphabetical order.
When two words have the same frequency, the problem requires returning the word with the smaller lexicographical order first. This means the comparison logic in sorting or heap operations must consider both frequency and alphabetical order.
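In Python this dual criterion falls out of tuple comparison: tuples compare element by element, so negating the count makes higher frequencies sort first while ties fall through to alphabetical order. A minimal illustration with made-up counts:

```python
# (-count, word) pairs: higher frequency sorts first, ties alphabetically.
pairs = [(-2, "love"), (-2, "i"), (-1, "coding")]
print(sorted(pairs))  # [(-2, 'i'), (-2, 'love'), (-1, 'coding')]
```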
In this C++ example, we define a custom comparator so the priority queue behaves as a min-heap ordered by frequency and, for ties, by lexicographical order. By limiting the heap to size k, we keep only the strongest candidates, producing the final list once weaker words have been evicted from the heap.