Table: Users
+-------------+----------+
| Column Name | Type     |
+-------------+----------+
| user_id     | int      |
| item        | varchar  |
| created_at  | datetime |
| amount      | int      |
+-------------+----------+
This table may contain duplicate records.
Each row includes the user ID, the purchased item, the date of purchase, and the purchase amount.
Write a solution to identify active users. An active user is a user that has made a second purchase within 7 days of any other of their purchases.
For example, if the ending date is May 31, 2023, then any date between May 31, 2023, and June 7, 2023 (inclusive) is considered "within 7 days" of May 31, 2023.
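The inclusive 7-day window from the example can be checked with a few lines of date arithmetic. This is only an illustrative sketch: the helper name `within_7_days` is hypothetical and not part of the problem.

```python
from datetime import date, timedelta


def within_7_days(earlier: date, later: date) -> bool:
    """Return True when `later` lies in [earlier, earlier + 7 days], inclusive."""
    return timedelta(0) <= later - earlier <= timedelta(days=7)


anchor = date(2023, 5, 31)
print(within_7_days(anchor, date(2023, 6, 7)))  # True: exactly 7 days later
print(within_7_days(anchor, date(2023, 6, 8)))  # False: 8 days later
```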
Return a list of user_id which denotes the list of active users in any order.
The result format is in the following example.
Example 1:
Input:
Users table:
+---------+-------------------+------------+--------+
| user_id | item              | created_at | amount |
+---------+-------------------+------------+--------+
| 5       | Smart Crock Pot   | 2021-09-18 | 698882 |
| 6       | Smart Lock        | 2021-09-14 | 11487  |
| 6       | Smart Thermostat  | 2021-09-10 | 674762 |
| 8       | Smart Light Strip | 2021-09-29 | 630773 |
| 4       | Smart Cat Feeder  | 2021-09-02 | 693545 |
| 4       | Smart Bed         | 2021-09-13 | 170249 |
+---------+-------------------+------------+--------+
Output:
+---------+
| user_id |
+---------+
| 6       |
+---------+
Explanation:
- The user with user_id 5 has only one transaction, so this user is not active.
- The user with user_id 6 has two transactions: the first on 2021-09-10 and the second on 2021-09-14. The distance between the two transaction dates is <= 7 days, so this user is active.
- The user with user_id 8 has only one transaction, so this user is not active.
- The user with user_id 4 has two transactions: the first on 2021-09-02 and the second on 2021-09-13. The distance between the two transaction dates is > 7 days, so this user is not active.
Problem Overview: The task is to identify users who made a second purchase within 7 days of another of their purchases. Each row of the Users table records one purchase together with its date. Your goal is to find every user who has at least two purchases whose dates are at most 7 days apart and return their IDs.
Approach 1: Self Join on Purchase Pairs (O(n^2) time, O(1) extra space)
A straightforward method joins the table to itself on user_id and keeps pairs of purchases whose created_at values differ by at most 7 days. Because the table has no primary key and may contain duplicate records, the join also needs a way to avoid pairing a row with itself, such as a surrogate row identifier. This approach works but becomes inefficient because every purchase of a user is compared with every other purchase of that user, which is quadratic in the worst case. It demonstrates the core idea of comparing purchase dates but scales poorly for large datasets.
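For the problem as stated (a second purchase within 7 days of another purchase), a self join could look like the sketch below. It runs the query against an in-memory SQLite database so that it is executable as written. Several things here are assumptions rather than part of the source: the helper name `active_users_self_join`, the use of SQLite's implicit `rowid` to keep a row from pairing with itself, the use of `julianday()` for the day difference, and the decision to treat duplicate rows as separate purchases. In MySQL the date test would typically be written with `DATEDIFF()`, and a surrogate key would have to stand in for `rowid`.

```python
import sqlite3

# Sample rows from Example 1: (user_id, item, created_at, amount)
ROWS = [
    (5, "Smart Crock Pot", "2021-09-18", 698882),
    (6, "Smart Lock", "2021-09-14", 11487),
    (6, "Smart Thermostat", "2021-09-10", 674762),
    (8, "Smart Light Strip", "2021-09-29", 630773),
    (4, "Smart Cat Feeder", "2021-09-02", 693545),
    (4, "Smart Bed", "2021-09-13", 170249),
]

# Pair each purchase with a later-inserted purchase of the same user.
# a.rowid < b.rowid prevents self-pairing while still letting two
# identical rows count as two purchases (an assumption, see lead-in).
SELF_JOIN_SQL = """
SELECT DISTINCT a.user_id
FROM Users AS a
JOIN Users AS b
  ON  a.user_id = b.user_id
  AND a.rowid < b.rowid
WHERE ABS(julianday(b.created_at) - julianday(a.created_at)) <= 7
ORDER BY a.user_id
"""


def active_users_self_join(rows):
    """Load rows into an in-memory table and return the active user IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Users (user_id INT, item TEXT, created_at TEXT, amount INT)"
    )
    conn.executemany("INSERT INTO Users VALUES (?, ?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(SELF_JOIN_SQL)]
    conn.close()
    return result


print(active_users_self_join(ROWS))  # [6]
```

On the sample data only user 6 qualifies: the purchases on 2021-09-10 and 2021-09-14 are 4 days apart, while user 4's purchases are 11 days apart.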
Approach 2: Window Function with LEAD/LAG (O(n log n) time, O(n) space)
The efficient solution uses SQL window functions. First, partition records by user_id and order them by created_at. For each row, fetch the date of the same user's next purchase with LEAD() (or the previous one with LAG()). The key observation is that only adjacent purchases need to be compared: if any two purchases of a user are within 7 days of each other, then the purchases directly next to each other in the sorted order are at least as close. Keep the rows where the difference between a purchase and its neighbor is at most 7 days and return the distinct user_id values.
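For the problem as stated (a second purchase within 7 days of another purchase), a window-function query could be sketched as follows, comparing each purchase with the same user's next purchase via `LEAD()`. The query runs against an in-memory SQLite database so the example is executable. The helper name `active_users_lead` and the use of `julianday()` are assumptions for illustration, and the sketch assumes an SQLite build with window-function support (3.25 or later). In MySQL 8.0 or later the filter would typically be written as `DATEDIFF(next_purchase, created_at) <= 7`.

```python
import sqlite3

# Sample rows from Example 1: (user_id, item, created_at, amount)
ROWS = [
    (5, "Smart Crock Pot", "2021-09-18", 698882),
    (6, "Smart Lock", "2021-09-14", 11487),
    (6, "Smart Thermostat", "2021-09-10", 674762),
    (8, "Smart Light Strip", "2021-09-29", 630773),
    (4, "Smart Cat Feeder", "2021-09-02", 693545),
    (4, "Smart Bed", "2021-09-13", 170249),
]

# For every purchase, look up the same user's next purchase date.
# The last purchase of each user gets NULL, which the WHERE clause drops.
LEAD_SQL = """
SELECT DISTINCT user_id
FROM (
    SELECT user_id,
           created_at,
           LEAD(created_at) OVER (
               PARTITION BY user_id
               ORDER BY created_at
           ) AS next_purchase
    FROM Users
) AS t
WHERE julianday(next_purchase) - julianday(created_at) <= 7
ORDER BY user_id
"""


def active_users_lead(rows):
    """Load rows into an in-memory table and return the active user IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Users (user_id INT, item TEXT, created_at TEXT, amount INT)"
    )
    conn.executemany("INSERT INTO Users VALUES (?, ?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(LEAD_SQL)]
    conn.close()
    return result


print(active_users_lead(ROWS))  # [6]
```

Duplicate rows end up next to each other in the ordering with a difference of 0 days, so under this sketch they count as two purchases; whether that matches the intended handling of duplicates is an assumption.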
This pattern is common in SQL problems that compare each row with its neighbor in an ordered sequence. Window functions let you sort each user's rows once and read adjacent values without repeated joins. The cost is dominated by the sort, so the approach scales far better than pairwise comparison on large datasets. Most modern relational databases, including MySQL 8.0 and later, support window functions.
Recommended for interviews: The window function approach is what interviewers expect. The self-join approach shows you understand how to compare purchase dates, but it is not scalable. Using LEAD() or LAG() with partitioning demonstrates strong SQL fundamentals and familiarity with window functions.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Self Join on Purchase Pairs | O(n^2) | O(1) | Useful for understanding the date comparison when window functions are unavailable |
| Window Function with LEAD/LAG | O(n log n) | O(n) | Best general solution in modern SQL databases; efficient for comparing adjacent rows |