Write a bash script to calculate the frequency of each word in a text file words.txt.
For simplicity's sake, you may assume:
words.txt contains only lowercase characters and space ' ' characters.

Example:
Assume that words.txt has the following content:
the day is sunny the the the sunny is is
Your script should output the following, sorted by descending frequency:
the 4
is 3
sunny 2
day 1
Note: Don't worry about handling ties; it is guaranteed that each word's frequency count is unique.
Problem Overview: The task reads a text file words.txt and prints each distinct word with its frequency, sorted by highest frequency first. Words are space separated, and the output is one "word count" pair per line.
Approach 1: Command-line Utilities Pipeline (O(n log n) time, O(n) space)
This approach uses standard Unix text processing tools to transform and aggregate the input. First normalize the input by converting spaces to newlines so each word appears on its own line. Then sort the words so identical values become adjacent. The uniq -c command counts consecutive duplicates, and a final sort orders results by frequency in descending order. This method relies on the efficiency of Unix tools and is the most idiomatic solution for Shell problems.
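A minimal sketch of this pipeline; the sample words.txt from the problem statement is created inline so the snippet is self-contained:

```shell
# Sample input from the problem statement (for a self-contained demo).
printf 'the day is sunny the the the sunny is is\n' > words.txt

tr -s ' ' '\n' < words.txt |   # one word per line; -s squeezes repeated spaces
  sort |                       # bring identical words together
  uniq -c |                    # count each run, e.g. "      4 the"
  sort -rn |                   # numeric sort, descending by count
  awk '{ print $2, $1 }'       # swap fields into "word count" format
```

The `-s` flag on tr matters: without it, consecutive spaces would produce empty lines that distort the counts.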
Approach 2: Using grep, sort, and uniq (O(n log n) time, O(n) space)
Another shell-focused approach extracts words using grep -o, which outputs every matched word on its own line. After extraction, the workflow mirrors the previous pipeline: sort the words, count occurrences with uniq -c, and then sort the counts in descending order. This method is useful when input formatting is inconsistent because grep provides stronger pattern matching than simple character translation.
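A sketch of the grep-based variant; the pattern `[a-z]+` is chosen here because the problem guarantees lowercase words, and the sample input is created inline for a self-contained demo:

```shell
# Sample input from the problem statement.
printf 'the day is sunny the the the sunny is is\n' > words.txt

grep -oE '[a-z]+' words.txt |  # -o prints each regex match on its own line
  sort |                       # group duplicates together
  uniq -c |                    # count consecutive duplicates
  sort -rn |                   # descending numeric sort by count
  awk '{ print $2, $1 }'       # reformat as "word count"
```

Because grep extracts matches rather than translating characters, this version is robust to stray whitespace or punctuation that tr would mishandle.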
Approach 3: Recursive Hash Map Counting (O(n) time, O(n) space)
A language-based solution treats the problem as a frequency counting task using a hash map. Recursively process the list of words, updating a map where the key is the word and the value is its count; each recursive call consumes one word and increments its stored frequency. (This is sometimes loosely called memoization, but it is really plain frequency counting.) After processing, convert the map into a list and sort by frequency. Note that the recursion depth is O(n), so the call stack dominates the O(k) map; the variant is mainly educational, demonstrating the classic hash map counting pattern used in string processing problems.
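A bash sketch of the recursive idea, using a bash associative array as the hash map (assumes bash 4+; the sample words.txt is created inline):

```shell
#!/usr/bin/env bash
# Recursive frequency counting sketch; requires bash 4+ for associative arrays.
printf 'the day is sunny the the the sunny is is\n' > words.txt

declare -A freq   # hash map: word -> count

count_words() {
  # Base case: no words left to consume.
  [ "$#" -eq 0 ] && return 0
  local word=$1
  shift
  freq[$word]=$(( ${freq[$word]:-0} + 1 ))  # increment this word's count
  count_words "$@"                          # recurse on the remaining words
}

count_words $(cat words.txt)                # word-split the file contents

# Convert the map into lines and sort by count, descending.
for w in "${!freq[@]}"; do
  printf '%s %s\n' "$w" "${freq[$w]}"
done | sort -k2,2 -rn
```

The recursion depth equals the number of words, which makes this impractical for large files; it exists to illustrate the pattern, not to replace the pipeline solutions.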
Approach 4: Iterative Frequency Map (O(n + k log k) time, O(k) space)
The iterative version is the standard approach in most programming languages. Iterate through the array of words and maintain a frequency dictionary. Each step performs a constant-time hash lookup and increment. Once all counts are computed, sort the entries by frequency in descending order using a comparator. The counting step is linear, while sorting costs O(k log k) where k is the number of unique words. This pattern appears frequently in problems involving sorting combined with frequency analysis.
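In shell, awk's associative array plays the role of the frequency dictionary, giving the same linear counting pass followed by an O(k log k) sort. A self-contained sketch, with the sample input created inline:

```shell
# Sample input from the problem statement.
printf 'the day is sunny the the the sunny is is\n' > words.txt

awk '
  { for (i = 1; i <= NF; i++) freq[$i]++ }   # O(n) counting pass over all fields
  END { for (w in freq) print w, freq[w] }   # emit unsorted "word count" pairs
' words.txt | sort -k2,2 -rn                 # O(k log k) sort by count, descending
```

The same structure translates directly to a dict in Python or a HashMap in Java: one pass to count, one sort over the k unique entries.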
Recommended for interviews: Interviewers expect the hash map counting approach because it shows you understand frequency tables and efficient lookups. The shell pipeline solution is the intended answer for this specific problem on LeetCode since it tests familiarity with Unix tools. Knowing both demonstrates practical engineering skill: hash maps for algorithmic interviews and command-line pipelines for real-world text processing.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Command-line Utilities Pipeline | O(n log n) | O(n) | Best for shell scripting and large text processing tasks |
| grep + sort + uniq | O(n log n) | O(n) | When input needs regex-based extraction before counting |
| Recursive Hash Map Counting | O(n) | O(n) | Educational approach demonstrating recursion with frequency maps |
| Iterative Frequency Map | O(n) + O(k log k) | O(k) | General-purpose solution in Python, Java, C++, or JavaScript |