Table: Keywords
+-------------+---------+ | Column Name | Type | +-------------+---------+ | topic_id | int | | word | varchar | +-------------+---------+ (topic_id, word) is the primary key (combination of columns with unique values) for this table. Each row of this table contains the id of a topic and a word that is used to express this topic. There may be more than one word to express the same topic and one word may be used to express multiple topics.
Table: Posts
+-------------+---------+ | Column Name | Type | +-------------+---------+ | post_id | int | | content | varchar | +-------------+---------+ post_id is the primary key (column with unique values) for this table. Each row of this table contains the ID of a post and its content. Content will consist only of English letters and spaces.
Leetcode has collected some posts from its social media website and is interested in finding the topics of each post. Each topic can be expressed by one or more keywords. If a keyword of a certain topic exists in the content of a post (case insensitive) then the post has this topic.
Write a solution to find the topics of each post according to the following rules:
"Ambiguous!".','. The string should not contain duplicate IDs.Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Keywords table: +----------+----------+ | topic_id | word | +----------+----------+ | 1 | handball | | 1 | football | | 3 | WAR | | 2 | Vaccine | +----------+----------+ Posts table: +---------+------------------------------------------------------------------------+ | post_id | content | +---------+------------------------------------------------------------------------+ | 1 | We call it soccer They call it football hahaha | | 2 | Americans prefer basketball while Europeans love handball and football | | 3 | stop the war and play handball | | 4 | warning I planted some flowers this morning and then got vaccinated | +---------+------------------------------------------------------------------------+ Output: +---------+------------+ | post_id | topic | +---------+------------+ | 1 | 1 | | 2 | 1 | | 3 | 1,3 | | 4 | Ambiguous! | +---------+------------+ Explanation: 1: "We call it soccer They call it football hahaha" "football" expresses topic 1. There is no other word that expresses any other topic. 2: "Americans prefer basketball while Europeans love handball and football" "handball" expresses topic 1. "football" expresses topic 1. There is no other word that expresses any other topic. 3: "stop the war and play handball" "war" expresses topic 3. "handball" expresses topic 1. There is no other word that expresses any other topic. 4: "warning I planted some flowers this morning and then got vaccinated" There is no word in this sentence that expresses any topic. Note that "warning" is different from "war" although they have a common prefix. This post is ambiguous. Note that it is okay to have one word that expresses more than one topic.
Problem Overview: Each post contains free‑text content. A separate Keywords table maps individual keywords to a topic_id. Your task is to detect which topic each post belongs to by checking whether any keyword appears in the post content. If exactly one distinct topic matches, return that topic. If none or multiple topics match, label the result as Ambiguous.
Approach 1: REGEXP Keyword Matching with Aggregation (MySQL)
This solution scans each post and attempts to match every keyword using a SQL REGEXP pattern. The join condition checks whether the keyword appears as a whole word inside Posts.content, typically using a pattern like CONCAT('(^| )', keyword, '( |$)'). After joining, group rows by post_id and compute COUNT(DISTINCT topic_id). If the count equals 1, that topic uniquely identifies the post; otherwise return 'Ambiguous'. Time complexity is roughly O(P × K × L), where P is number of posts, K keywords, and L average content length due to regex scanning. Space complexity is O(1) beyond aggregation.
The key insight is using SQL aggregation to collapse multiple keyword matches into a single decision per post. COUNT(DISTINCT topic_id) detects ambiguity, while MIN(topic_id) or MAX(topic_id) safely returns the only topic when the count equals one. A LEFT JOIN ensures posts with no keyword matches are still included, which naturally falls into the ambiguous case because the topic count is not exactly one.
This pattern is common in text classification tasks implemented directly in SQL. The database handles pattern matching, grouping, and conditional logic in one pass without exporting data to application code. Concepts overlap with database querying, text filtering using string processing, and aggregation patterns similar to hash-based counting used in algorithmic problems.
Recommended for interviews: The regex join plus aggregation approach is the expected solution. It demonstrates strong SQL fundamentals: pattern matching, joins, grouping, and conditional projection. A naive approach that manually parses words outside SQL would be inefficient and misses the point of the problem. Interviewers want to see you translate text matching into relational operations.
MySQL
| Approach | Time | Space | When to Use |
|---|---|---|---|
| REGEXP Join + Aggregation | O(P × K × L) | O(1) | General SQL solution when keywords must be matched inside text content |
LeetCode Hard 2199 Facebook “Finding the Topic of Each Post" Interview SQL Question with Explanation • Everyday Data Science • 2,236 views views
Practice Finding the Topic of Each Post with our built-in code editor and test cases.
Practice on FleetCode