Table: Views
| Column Name | Type |
|---|---|
| article_id | int |
| author_id | int |
| viewer_id | int |
| view_date | date |

This table may have duplicate rows. Each row of this table indicates that some viewer viewed an article (written by some author) on some date. Note that equal author_id and viewer_id indicate the same person.
Write a solution to find all the people who viewed more than one article on the same date.
Return the result table sorted by id in ascending order.
The result format is in the following example.
Example 1:
Input:

Views table:

| article_id | author_id | viewer_id | view_date |
|---|---|---|---|
| 1 | 3 | 5 | 2019-08-01 |
| 3 | 4 | 5 | 2019-08-01 |
| 1 | 3 | 6 | 2019-08-02 |
| 2 | 7 | 7 | 2019-08-01 |
| 2 | 7 | 6 | 2019-08-02 |
| 4 | 7 | 1 | 2019-07-22 |
| 3 | 4 | 4 | 2019-07-21 |
| 3 | 4 | 4 | 2019-07-21 |

Output:

| id |
|---|
| 5 |
| 6 |
Problem Overview: The Views table records which viewer opened which article on a specific date. The task is to return all viewer_id values where the same viewer viewed more than one distinct article on the same day. The result should contain unique viewer IDs, returned in a column named id and sorted in ascending order.
Approach 1: Self Join on Same Day (O(n^2) time, O(1) space)
A straightforward way is joining the Views table with itself. Match rows where v1.viewer_id = v2.viewer_id and v1.view_date = v2.view_date, but ensure v1.article_id != v2.article_id. This identifies cases where a viewer opened two different articles on the same day. After the join, select distinct viewer_id. This method works but performs a pairwise comparison between rows for the same viewer and date, which can grow expensive as the dataset increases.
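The self join described above can be sketched as follows. This is a minimal illustration, not a reference solution: it assumes SQLite (through Python's standard `sqlite3` module) as a stand-in for MySQL so the query can be run locally, and the names `ROWS`, `SELF_JOIN_SQL`, and `self_join_ids` are hypothetical helpers introduced only for this example. The SQL text itself uses only standard syntax.

```python
import sqlite3

# Rows from Example 1: (article_id, author_id, viewer_id, view_date)
ROWS = [
    (1, 3, 5, "2019-08-01"),
    (3, 4, 5, "2019-08-01"),
    (1, 3, 6, "2019-08-02"),
    (2, 7, 7, "2019-08-01"),
    (2, 7, 6, "2019-08-02"),
    (4, 7, 1, "2019-07-22"),
    (3, 4, 4, "2019-07-21"),
    (3, 4, 4, "2019-07-21"),
]

# Pair each row with every other row of the same viewer and date,
# keeping only pairs that refer to two different articles.
SELF_JOIN_SQL = """
SELECT DISTINCT v1.viewer_id AS id
FROM Views AS v1
JOIN Views AS v2
  ON v1.viewer_id = v2.viewer_id
 AND v1.view_date = v2.view_date
 AND v1.article_id <> v2.article_id
ORDER BY id
"""


def self_join_ids(rows):
    """Load rows into an in-memory Views table and run the self-join query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Views "
        "(article_id INT, author_id INT, viewer_id INT, view_date TEXT)"
    )
    conn.executemany("INSERT INTO Views VALUES (?, ?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(SELF_JOIN_SQL)]
    conn.close()
    return ids


print(self_join_ids(ROWS))  # [5, 6]
```

Viewer 4 does not appear in the result: both of their rows refer to article 3, so the `v1.article_id <> v2.article_id` condition never matches.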
Approach 2: GROUP BY with HAVING (O(n) time, O(k) space)
The efficient approach groups rows by viewer_id and view_date. For each group, count the distinct articles with COUNT(DISTINCT article_id); the DISTINCT matters because the table may contain duplicate rows, so a plain COUNT(*) would wrongly flag a viewer who opened the same article twice. If the count is greater than one, the viewer read multiple articles that day. Because a viewer can qualify on more than one date, select distinct viewer_id and order the result. This method needs only a single pass over the table plus the grouping step, and aggregation is well optimized in most database engines.
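A minimal sketch of the aggregation query, under the same caveats: SQLite via Python's standard `sqlite3` module is an assumption made so the query can be executed locally instead of on MySQL, and `ROWS`, `GROUP_BY_SQL`, and `group_by_ids` are hypothetical names used only for illustration.

```python
import sqlite3

# Rows from Example 1: (article_id, author_id, viewer_id, view_date)
ROWS = [
    (1, 3, 5, "2019-08-01"),
    (3, 4, 5, "2019-08-01"),
    (1, 3, 6, "2019-08-02"),
    (2, 7, 7, "2019-08-01"),
    (2, 7, 6, "2019-08-02"),
    (4, 7, 1, "2019-07-22"),
    (3, 4, 4, "2019-07-21"),
    (3, 4, 4, "2019-07-21"),
]

# One group per (viewer, date); keep groups with at least two distinct
# articles, then collapse viewers that qualify on several dates.
GROUP_BY_SQL = """
SELECT DISTINCT viewer_id AS id
FROM Views
GROUP BY viewer_id, view_date
HAVING COUNT(DISTINCT article_id) > 1
ORDER BY id
"""


def group_by_ids(rows):
    """Load rows into an in-memory Views table and run the GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Views "
        "(article_id INT, author_id INT, viewer_id INT, view_date TEXT)"
    )
    conn.executemany("INSERT INTO Views VALUES (?, ?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(GROUP_BY_SQL)]
    conn.close()
    return ids


print(group_by_ids(ROWS))  # [5, 6]
```

The outer `DISTINCT` is what keeps a viewer from being listed once per qualifying date, since the grouping key includes `view_date` but the output does not.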
The key insight is recognizing that the condition "viewed multiple articles on the same day" maps directly to an aggregation constraint. Instead of comparing rows pairwise, you aggregate rows by viewer and date and apply a HAVING filter. SQL engines handle grouping efficiently, making this the preferred solution.
This problem is a classic example of using aggregation in database queries. Understanding GROUP BY, COUNT(DISTINCT ...), and filtering with HAVING is essential for many interview-style SQL tasks. The technique also appears in analytics queries where you detect repeated activity within a time window, which is a common pattern in data analysis.
Recommended for interviews: The GROUP BY + HAVING solution is what interviewers expect. The self-join approach shows you understand relational comparisons, but the aggregation approach demonstrates stronger SQL fluency and better scalability.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Self Join on Same Day | O(n^2) | O(1) | Conceptual approach when demonstrating relational joins between rows |
| GROUP BY with HAVING and COUNT(DISTINCT) | O(n) | O(k) | Best general solution for SQL interview questions involving repeated activity detection |