Given an array of strings words and an integer k, return the k most frequent strings.
Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.
Example 1:
Input: words = ["i","love","leetcode","i","love","coding"], k = 2
Output: ["i","love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4
Output: ["the","is","sunny","day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrences being 4, 3, 2 and 1 respectively.
Constraints:
1 <= words.length <= 500
1 <= words[i].length <= 10
words[i] consists of lowercase English letters.
k is in the range [1, the number of unique words[i]].
Follow-up: Could you solve it in O(n log(k)) time and O(n) extra space?
Problem Overview: Given an array of words, return the k most frequent words. Words with higher frequency come first, and when two words have the same frequency, the lexicographically smaller word appears earlier. The challenge is efficiently counting frequencies and ordering results by both frequency and alphabetical order.
Approach 1: Hash Map and Sorting (Time: O(n log n), Space: O(n))
The straightforward solution uses a frequency map. First iterate through the input array and count occurrences using a hash table. This step runs in O(n). Next convert the map entries into a list and sort them using a custom comparator: higher frequency first, and if frequencies match, compare words lexicographically.
The sorting step dominates the runtime at O(m log m), where m is the number of unique words (worst case m = n). The comparator ensures correct ordering without additional passes. After sorting, simply take the first k words. This approach is easy to implement and works well when the number of unique words is moderate.
Because the heavy lifting is done by sorting, this method is reliable and readable. Many interview candidates start with this solution because it clearly separates counting and ordering logic.
Approach 2: Min-Heap (Time: O(n log k), Space: O(n))
When k is much smaller than the number of unique words, a heap reduces unnecessary sorting. First build the same frequency map using a hash table in O(n). Then iterate through each unique word and push it into a size-limited min-heap based on frequency.
The heap stores at most k elements. The comparator prioritizes smaller frequency, and when frequencies tie, the lexicographically larger word is considered smaller in the heap so it gets removed first. Whenever the heap size exceeds k, pop the smallest element. This keeps only the top k candidates.
Each heap insertion or removal costs O(log k), so processing all unique words takes O(m log k). After processing, extract elements from the heap and reverse them to produce the final order. This approach relies on a priority queue to avoid sorting the entire dataset.
Recommended for interviews: Start with the hash map + sorting approach because it demonstrates clear thinking and correct ordering logic. Then mention the min-heap optimization. Interviewers often expect the heap solution when k is small relative to the dataset, since it reduces the ordering cost from O(n log n) to O(n log k) while still using a hash map for counting.
Explanation:
To solve this problem, we can use a hash map (or dictionary) to count the frequency of each word in the given array. Once we have the frequencies, we can then create a list of words and sort them based on their frequency in descending order. An additional condition for sorting is that if two words have the same frequency, we sort them alphabetically.
This approach keeps the solution simple: a single sort over the unique words (bounded by their count) places the top k frequent words at the front of the list, ready to be extracted.
In this C solution, we use an array of Node structs to record words and their counts. We iterate through the input array, updating frequency counts. After processing the input, sorting is done via qsort using a custom comparator that sorts first by frequency, then lexicographically. Finally, we collect the first k elements as the result.
Time Complexity: O(n log n), where n is the number of words (in the worst case every word is unique). This arises from the sorting operation.
Space Complexity: O(n), used for storing unique words.
Explanation:
This method takes advantage of a min-heap to efficiently determine the top k frequent words. Instead of sorting the entire set of words, which incurs an O(n log n) cost, we can maintain a min-heap of size k. As we iterate through the frequency map, we push elements into the heap. If heap size exceeds k, we remove the smallest element (i.e., least frequent word or alphabetically greater if frequencies are tied).
This is guided by the Follow-up question, showcasing an alternative solution bounded by O(n log k) time complexity.
In this C++ example, we create a custom comparator to maintain a priority queue (min-heap) based on the frequency and lexicographic criteria. By limiting the heap to size k, we efficiently manage only the most promising candidates, producing the final list once extraneous words have been removed from the heap.
Time Complexity: O(n log k), achieved by capping the heap at k elements.
Space Complexity: O(n + k), combining map and heap usage.
| Approach | Complexity |
|---|---|
| Approach 1: Hash Map and Sorting | Time Complexity: O(n log n), where n is the number of unique words. This arises from the sorting operation. Space Complexity: O(n), used for storing unique words. |
| Approach 2: Min-Heap | Time Complexity: O(n log k), achieved by capping the heap at k elements. Space Complexity: O(n + k), combining map and heap usage. |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Hash Map + Sorting | O(n log n) | O(n) | General solution when simplicity and readability matter more than strict optimization |
| Min-Heap (Priority Queue) | O(n log k) | O(n) | Best when k is much smaller than the number of unique words |