Count Occurrences in Text - Solution & Explanation

MediumPremiumFree on FleetCodeDatabase4 min read

Problem Statement

Table: Files

+-------------+---------+
| Column Name | Type    |
+-- ----------+---------+
| file_name   | varchar |
| content     | text    |
+-------------+---------+
file_name is the column with unique values of this table. 
Each row contains file_name and the content of that file.

Write a solution to find the number of files that have at least one occurrence of the words 'bull' and 'bear' as a standalone word, respectively, disregarding any instances where it appears without space on either side (e.g. 'bullet', 'bears', 'bull.', or 'bear' at the beginning or end of a sentence will not be considered)

Return the word 'bull' and 'bear' along with the corresponding number of occurrences in any order.

The result format is in the following example.

Example 1:

Input: 
Files table:
+------------+----------------------------------------------------------------------------------+
| file_name  | content                                                                         | 
+------------+----------------------------------------------------------------------------------+
| draft1.txt | The stock exchange predicts a bull market which would make many investors happy. | 
| draft2.txt | The stock exchange predicts a bull market which would make many investors happy, |
|            | but analysts warn of possibility of too much optimism and that in fact we are    |
|            | awaiting a bear market.                                                          | 
| draft3.txt | The stock exchange predicts a bull market which would make many investors happy, |
|            | but analysts warn of possibility of too much optimism and that in fact we are    |
|            | awaiting a bear market. As always predicting the future market is an uncertain   |
|            | game and all investors should follow their instincts and best practices.         | 
+------------+----------------------------------------------------------------------------------+
Output: 
+------+-------+
| word | count |  
+------+-------+
| bull | 3     | 
| bear | 2     | 
+------+-------+
Explanation: 
- The word "bull" appears 1 time in "draft1.txt", 1 time in "draft2.txt", and 1 time in "draft3.txt". Therefore, the total number of occurrences for the word "bull" is 3.
- The word "bear" appears 1 time in "draft2.txt", and 1 time in "draft3.txt". Therefore, the total number of occurrences for the word "bear" is 2.

Approach Overview

Problem Overview: You have a table containing text content. The goal is to count how many times specific keywords (for example bull and bear) appear across all rows. Each occurrence inside the text must be counted, even if multiple matches appear in the same string.

Approach 1: LENGTH and REPLACE Trick (O(n) time, O(1) space)

The common SQL technique for counting substring occurrences uses the difference in string length before and after removing the target word. When you call REPLACE(content, 'bull', ''), every occurrence of the word is removed from the string. The difference between LENGTH(content) and the replaced string tells you how many characters were removed. Divide that difference by the length of the word to get the number of occurrences.

Apply this formula to each row and aggregate the result using SUM(). For example: (LENGTH(content) - LENGTH(REPLACE(content,'bull',''))) / 4 counts occurrences of the word bull. Repeat the same logic for other keywords like bear. Since SQL scans each text once per expression, the runtime is linear with respect to the total text length.

This method works well because it avoids complex parsing or tokenization. SQL engines optimize built-in string functions efficiently, making this approach concise and performant for moderate datasets. It also works consistently across many relational systems discussed in database problems that involve string processing and aggregation queries.

Approach 2: REGEXP-based Counting (O(n) time, O(1) space)

If your SQL engine supports regex counting functions such as REGEXP_COUNT(), you can directly count pattern matches inside the text. This approach expresses the intent more clearly: REGEXP_COUNT(content, 'bull'). You then aggregate counts across all rows. The database internally scans the string using its regex engine, producing the same linear-time behavior.

Regex-based solutions are easier to read but may not be available in every MySQL version used in interview platforms. The LENGTH - REPLACE trick remains the most portable technique for substring frequency queries.

Recommended for interviews: Use the LENGTH - REPLACE method. It shows that you understand SQL string manipulation and aggregation. Regex functions are cleaner when available, but interviewers typically expect the portable approach used in many SQL interview questions.

Solution

Code

MySQL

Try this approach in the editor →

Detailed Complexity Analysis

Approach	Time	Space	When to Use
LENGTH - REPLACE String Difference	O(n)	O(1)	Most common SQL interview technique; works across MySQL versions without regex support
REGEXP_COUNT Pattern Matching	O(n)	O(1)	Cleaner syntax when the SQL engine supports regex counting functions

Video Solution

Leetcode MEDIUM 2738 - Count Occurrences in Text - Using SUM to Count in SQL | Everyday Data Science • Everyday Data Science • 652 views views

Frequently Asked Questions

Is Count Occurrences in Text easy or hard?

The problem is rated Medium because it requires knowledge of SQL string manipulation tricks rather than straightforward aggregation. Once you know the LENGTH minus REPLACE technique, the query becomes short and efficient.

Count Occurrences in Text Python/Java solution

The LeetCode problem is categorized under Database, so the solution is written in SQL rather than Python or Java. The logic relies on SQL string functions such as LENGTH, REPLACE, and optionally REGEXP_COUNT to compute word frequency.

How to solve Count Occurrences in Text in O(n)?

Use the formula (LENGTH(text) - LENGTH(REPLACE(text, 'word', ''))) / LENGTH('word'). This computes how many characters were removed when the word was deleted, which directly translates to the number of occurrences. Aggregate the counts with SUM across rows.

What is the best approach for Count Occurrences in Text?

The most reliable approach uses the SQL LENGTH and REPLACE trick. Compute the difference between the original string length and the string after removing the target word, then divide by the word length. This counts how many times the substring appears and works across most MySQL versions.

Is Count Occurrences in Text asked at Google/Amazon/Meta?

Database string-processing questions appear frequently in SQL interviews at companies like Amazon, Google, and Meta. Variants of substring counting, log parsing, and text aggregation are commonly used to test SQL function knowledge.

What data structure is used in Count Occurrences in Text?

This problem does not rely on traditional data structures like arrays or hash maps. Instead, it uses SQL string manipulation functions and aggregation operations inside a relational database query.

What is the time complexity of Count Occurrences in Text?

The time complexity is O(n), where n is the total length of the text being processed. SQL string functions like LENGTH and REPLACE scan the string once, so counting occurrences scales linearly with the size of the content.

Ready to solve this problem?

Practice Count Occurrences in Text with our built-in code editor and test cases.

Practice on FleetCode

Combine Two Tables

Second Highest Salary

Problem Info

DifficultyMedium

Acceptance55.6%

Approaches1

Reading time4 min

Practice this problem

Open in Editor

Count Occurrences in Text - Solution & Explanation

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Count Occurrences in Text - Solution & Explanation

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents