Table: Activity
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| session_id | int |
| activity_date | date |
| activity_type | enum |
+---------------+---------+
This table may have duplicate rows.
The activity_type column is an ENUM (category) of type ('open_session', 'end_session', 'scroll_down', 'send_message').
The table shows the user activities for a social media website.
Note that each session belongs to exactly one user.
Write a solution to find the daily active user count for a period of 30 days ending 2019-07-27 inclusively. A user was active on someday if they made at least one activity on that day.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Activity table: +---------+------------+---------------+---------------+ | user_id | session_id | activity_date | activity_type | +---------+------------+---------------+---------------+ | 1 | 1 | 2019-07-20 | open_session | | 1 | 1 | 2019-07-20 | scroll_down | | 1 | 1 | 2019-07-20 | end_session | | 2 | 4 | 2019-07-20 | open_session | | 2 | 4 | 2019-07-21 | send_message | | 2 | 4 | 2019-07-21 | end_session | | 3 | 2 | 2019-07-21 | open_session | | 3 | 2 | 2019-07-21 | send_message | | 3 | 2 | 2019-07-21 | end_session | | 4 | 3 | 2019-06-25 | open_session | | 4 | 3 | 2019-06-25 | end_session | +---------+------------+---------------+---------------+ Output: +------------+--------------+ | day | active_users | +------------+--------------+ | 2019-07-20 | 2 | | 2019-07-21 | 2 | +------------+--------------+ Explanation: Note that we do not care about days with zero active users.
Problem Overview: The table Activity stores user sessions with an activity_date. You need to return the number of distinct users active on each day within the last 30 days ending on 2019-07-27. The result groups records by day and counts unique users for that date.
Approach 1: SQL Aggregation with Date Filtering (O(n) time, O(k) space)
This is the natural solution when working with relational data. Filter rows where activity_date falls within the 30‑day window using a date range condition such as BETWEEN DATE_SUB('2019-07-27', INTERVAL 29 DAY) AND '2019-07-27'. After filtering, group the remaining rows by activity_date and compute COUNT(DISTINCT user_id) to measure daily active users. The database engine scans the relevant rows once and performs a grouped aggregation. Time complexity is O(n) where n is the number of rows in the table, and space complexity is O(k) for storing grouped results where k is the number of distinct days in the window.
This approach relies on core SQL operations: filtering, grouping, and distinct counting. It is concise, efficient, and exactly how production analytics queries are written on top of a database.
Approach 2: Data Processing with Scripting (O(n) time, O(k) space)
If the dataset is processed outside the database (for example in Python or JavaScript), you can iterate through all activity records and build a map keyed by activity_date. For each record within the 30‑day range, insert the user_id into a set associated with that date. Sets guarantee uniqueness, so duplicate sessions from the same user on the same day are ignored automatically. After processing all rows, compute the size of each set to produce the active user count per day.
This method uses a dictionary of sets (Map<date, Set<user_id>>) and is common in general-purpose data pipelines. The iteration step runs in O(n) time, while memory usage grows with the number of distinct days and users stored in the sets. The underlying idea is similar to problems involving a hash map for grouping.
Recommended for interviews: Interviewers expect the SQL aggregation approach. It demonstrates that you understand filtering a time window, grouping rows, and counting distinct values efficiently. The scripting approach is useful when the data is already loaded in memory or when solving the problem in languages like Python during data-processing interviews.
This approach uses SQL aggregation functions to filter activities in the last 30 days, group the activities by date, and count distinct user IDs for each date. We'll make use of the DATE_SUB function to get the starting date and apply GROUP BY and COUNT DISTINCT functions to get the active user count per day.
This SQL statement selects dates and counts distinct users who were active on those dates. It filters the dates to the last 30 days ending on 2019-07-27 using the BETWEEN clause and DATE_SUB function. The GROUP BY activity_date groups the results by each day, and COUNT(DISTINCT user_id) counts the number of unique users active on each of those days.
SQL
Time Complexity is approximately O(n) where n is the number of records in the table, because SQL has to scan all entries to filter and group them. Space Complexity is O(k) where k is the number of unique days with activities in the range, as that's the size of the result set.
This approach involves using a script to fetch and process data entries to count unique users per day. The script aggregates the data manually by iterating over each record, filtering by date, and maintaining a set of unique users for each activity date.
This Python code uses a defaultdict to map each activity_date to a set of unique user_ids, accumulating only those records where the date falls between 2019-06-28 and 2019-07-27. The use of sets ensures all user_ids are distinct for each day. Finally, it returns a list of tuples showing each day and the count of its active users.
Python
JavaScript
Time Complexity is O(n), where n is the number of activity records, as each record is processed separately. Space Complexity is O(d + u) where d is the number of days in the result and u is the total number of unique users aggregated in the set for all days.
MySQL
| Approach | Complexity |
|---|---|
| SQL Aggregation with Date Filtering | Time Complexity is approximately O(n) where n is the number of records in the table, because SQL has to scan all entries to filter and group them. Space Complexity is O(k) where k is the number of unique days with activities in the range, as that's the size of the result set. |
| Data Processing with Scripting | Time Complexity is O(n), where n is the number of activity records, as each record is processed separately. Space Complexity is O(d + u) where d is the number of days in the result and u is the total number of unique users aggregated in the set for all days. |
| Default Approach | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL Aggregation with Date Filtering | O(n) | O(k) | Best for SQL interviews and production database queries |
| Scripting with Hash Map and Sets | O(n) | O(k + u) | Useful when processing exported data in Python or JavaScript |
User Activity for the Past 30 Days I | Leetcode 1141 | Crack SQL Interviews in 50 Qs #mysql • Learn With Chirag • 10,814 views views
Watch 9 more video solutions →Practice User Activity for the Past 30 Days I with our built-in code editor and test cases.
Practice on FleetCode