Table: Activity
+--------------+---------+ | Column Name | Type | +--------------+---------+ | player_id | int | | device_id | int | | event_date | date | | games_played | int | +--------------+---------+ (player_id, event_date) is the primary key (combination of columns with unique values) of this table. This table shows the activity of players of some games. Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.
The install date of a player is the first login day of that player.
We define day one retention of some date x to be the number of players whose install date is x and they logged back in on the day right after x, divided by the number of players whose install date is x, rounded to 2 decimal places.
Write a solution to report for each install date, the number of players that installed the game on that day, and the day one retention.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Activity table: +-----------+-----------+------------+--------------+ | player_id | device_id | event_date | games_played | +-----------+-----------+------------+--------------+ | 1 | 2 | 2016-03-01 | 5 | | 1 | 2 | 2016-03-02 | 6 | | 2 | 3 | 2017-06-25 | 1 | | 3 | 1 | 2016-03-01 | 0 | | 3 | 4 | 2016-07-03 | 5 | +-----------+-----------+------------+--------------+ Output: +------------+----------+----------------+ | install_dt | installs | Day1_retention | +------------+----------+----------------+ | 2016-03-01 | 2 | 0.50 | | 2017-06-25 | 1 | 0.00 | +------------+----------+----------------+ Explanation: Player 1 and 3 installed the game on 2016-03-01 but only player 1 logged back in on 2016-03-02 so the day 1 retention of 2016-03-01 is 1 / 2 = 0.50 Player 2 installed the game on 2017-06-25 but didn't log back in on 2017-06-26 so the day 1 retention of 2017-06-25 is 0 / 1 = 0.00
Problem Overview: The table Activity stores player login activity with columns player_id, device_id, event_date, and games_played. The goal is to compute the fraction of players who return exactly one day after their first login. The result is grouped by each player's install date (their first recorded login).
Approach 1: Subquery + Self Join (O(n) time, O(n) space)
Start by finding each player's install date using MIN(event_date) grouped by player_id. This produces a derived table where every player has a single install date. Next, join this result back to the Activity table and check whether the player has another record where event_date = install_date + INTERVAL 1 DAY. Counting these matches gives the number of players who returned the next day. Finally, group by install_date and divide the number of returning players by the total number of installs for that date. The database performs a linear scan with aggregation, giving O(n) time complexity and O(n) intermediate storage for grouped results.
This approach relies on core database operations: grouping, joins, and date arithmetic. The key insight is separating the "first login" computation from the "next-day return" check. Once the install date is known, verifying retention becomes a simple equality condition.
Approach 2: Window Function Strategy (O(n) time, O(n) space)
Another solution uses window functions available in modern SQL engines. Use MIN(event_date) OVER (PARTITION BY player_id) to compute each player's install date directly inside the query without a separate aggregation table. Then filter rows where event_date = install_date + INTERVAL 1 DAY to identify returning players. Aggregate results by install date and compute the retention fraction using COUNT(DISTINCT player_id). Window functions simplify query structure and often reduce nesting while still scanning the dataset once.
This approach highlights advanced SQL features and is useful when multiple per-player metrics must be computed simultaneously. Engines optimize window partitions efficiently, keeping complexity at O(n).
Recommended for interviews: The subquery + join approach is typically expected. It clearly separates install-date computation from retention analysis and demonstrates strong fundamentals in database aggregation and join logic. Window functions are more concise but depend on familiarity with analytic SQL features.
MySQL
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Subquery + Self Join | O(n) | O(n) | Most common SQL solution; clear install-date calculation and retention check |
| Window Function Strategy | O(n) | O(n) | Cleaner query when window functions are supported |
| Naive Multi-Query Aggregation | O(n log n) | O(n) | Conceptual approach for understanding retention before optimizing joins |
LeetCode 1097: Game Play Analysis V (Hard) [SQL] • Frederik Müller • 3,348 views views
Watch 1 more video solutions →Practice Game Play Analysis V with our built-in code editor and test cases.
Practice on FleetCode