Table: Purchases
+---------------+------+ | Column Name | Type | +---------------+------+ | purchase_id | int | | user_id | int | | purchase_date | date | +---------------+------+ purchase_id contains unique values. This table contains logs of the dates that users purchased from a certain retailer.
Write a solution to report the IDs of the users that made any two purchases at most 7 days apart.
Return the result table ordered by user_id.
The result format is in the following example.
Example 1:
Input: Purchases table: +-------------+---------+---------------+ | purchase_id | user_id | purchase_date | +-------------+---------+---------------+ | 4 | 2 | 2022-03-13 | | 1 | 5 | 2022-02-11 | | 3 | 7 | 2022-06-19 | | 6 | 2 | 2022-03-20 | | 5 | 7 | 2022-06-19 | | 2 | 2 | 2022-06-08 | +-------------+---------+---------------+ Output: +---------+ | user_id | +---------+ | 2 | | 7 | +---------+ Explanation: User 2 had two purchases on 2022-03-13 and 2022-03-20. Since the second purchase is within 7 days of the first purchase, we add their ID. User 5 had only 1 purchase. User 7 had two purchases on the same day so we add their ID.
Problem Overview: You are given a Purchases table containing user_id and purchase_date. The task is to return users who made at least two purchases where the difference between the purchase dates is seven days or less. The result should contain unique user IDs that satisfy this condition.
Approach 1: Self Join on Purchases (O(n²) join, O(1) extra space)
The straightforward solution joins the Purchases table with itself on user_id. For each pair of purchases from the same user, compute the date difference using DATEDIFF(p2.purchase_date, p1.purchase_date). Keep rows where the difference is between 0 and 7 days and ensure the second purchase occurs after the first to avoid duplicate pairs. Finally return DISTINCT user_id. This approach directly models the condition but performs a quadratic comparison for users with many purchases.
This technique is common in SQL interview problems where relationships must be checked between rows of the same table. It works well for small datasets or when window functions are unavailable.
Approach 2: Window Function with LAG (O(n log n) time, O(n) space)
A more scalable solution uses a window function. First sort each user's purchases by purchase_date. Then use LAG(purchase_date) partitioned by user_id to access the previous purchase for that user. Compute DATEDIFF(current_date, previous_date) and filter rows where the difference is ≤ 7. If such a row exists, the user qualifies.
The key insight: purchases only need to be compared with the immediately previous purchase once the data is ordered. If two purchases within seven days exist anywhere in the sequence, they will appear as adjacent rows after sorting. This eliminates the expensive pairwise comparison from the self join.
This approach relies on window functions and ordered partitions, which are standard tools for analytical queries in modern database systems. The database performs a partition sort per user, giving roughly O(n log n) time complexity.
Recommended for interviews: The window function solution is typically expected. It shows you understand ordered partitions and row-wise comparisons without exploding the join size. Mentioning the self join approach first demonstrates that you understand the brute-force relational comparison, while the LAG-based query shows practical SQL optimization skills.
MySQL
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Self Join on Purchases | O(n²) worst-case | O(1) | Simple SQL environments or when window functions are unavailable |
| Window Function (LAG) | O(n log n) | O(n) | Preferred approach for large datasets and modern SQL databases |
LeetCode Medium 2228 Amazon “Users With 2 Purchases Within 7 Day" Interview SQL Question Explanation • Everyday Data Science • 1,412 views views
Practice Users With Two Purchases Within Seven Days with our built-in code editor and test cases.
Practice on FleetCode