Watch 10 video solutions for Game Play Analysis IV, a medium level problem involving Database. This walkthrough by Learn With Chirag has 17,575 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Activity
+--------------+---------+ | Column Name | Type | +--------------+---------+ | player_id | int | | device_id | int | | event_date | date | | games_played | int | +--------------+---------+ (player_id, event_date) is the primary key (combination of columns with unique values) of this table. This table shows the activity of players of some games. Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.
Write a solution to report the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. In other words, you need to count the number of players that logged in for at least two consecutive days starting from their first login date, then divide that number by the total number of players.
The result format is in the following example.
Example 1:
Input: Activity table: +-----------+-----------+------------+--------------+ | player_id | device_id | event_date | games_played | +-----------+-----------+------------+--------------+ | 1 | 2 | 2016-03-01 | 5 | | 1 | 2 | 2016-03-02 | 6 | | 2 | 3 | 2017-06-25 | 1 | | 3 | 1 | 2016-03-02 | 0 | | 3 | 4 | 2018-07-03 | 5 | +-----------+-----------+------------+--------------+ Output: +-----------+ | fraction | +-----------+ | 0.33 | +-----------+ Explanation: Only the player with id 1 logged back in after the first day he had logged in so the answer is 1/3 = 0.33
Problem Overview: The table Activity records player logins by date. The task is to compute the fraction of players who logged in again exactly one day after their first login. The result should be the number of such players divided by the total number of distinct players.
Approach 1: SQL Query for Aggregation (O(n) time, O(1) extra space)
The cleanest solution relies on SQL aggregation. First determine each player's first login date using MIN(event_date) grouped by player_id. Then check whether a record exists where the same player logged in on first_login_date + 1. This can be done using a self join or subquery. Finally compute the ratio of players who satisfy this condition divided by the total number of players. This approach works well because relational databases handle grouping and filtering efficiently. It primarily uses operations from database querying and SQL aggregation.
Approach 2: Data Processing with Python (O(n log n) time, O(n) space)
If you load the table into memory, you can process it using Python data structures. Group all login dates by player_id using a dictionary. For each player, sort their login dates and identify the earliest one. Then check whether the date first_login + 1 day appears in the player's login set. Maintain a counter for players who satisfy the condition and divide by the total number of players. Sorting introduces O(n log n) complexity, but using sets for membership checks keeps each lookup constant time.
Approach 3: Using SQL to Find Consecutive Logins (O(n) time, O(n) space)
This variation focuses on detecting consecutive dates directly. Instead of explicitly computing the next-day condition in application code, the SQL query joins the table with itself where a.player_id = b.player_id and b.event_date = DATE_ADD(a.event_date, INTERVAL 1 DAY). Restrict a.event_date to the player's first login using a subquery. The result identifies players who logged in on consecutive days starting from their first session. The technique is a common pattern when working with date handling in SQL.
Approach 4: Date Handling with Multiple Passes (O(n) time, O(n) space)
Another implementation scans the dataset in two passes. First pass computes the earliest login date for every player using a hash map. Second pass checks whether any record exists where the player's login date equals first_login + 1. Maintain a boolean flag per player to avoid double counting. This approach avoids sorting and works in linear time, making it suitable when implementing the logic in languages like C++ or Java.
Recommended for interviews: The SQL aggregation solution is what most interviewers expect for database problems. It shows you understand grouping, joins, and date arithmetic directly inside SQL. The multi-pass hash map approach demonstrates the same reasoning when implementing the logic in a general-purpose language.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| SQL Aggregation with First Login | O(n) | O(1) | Best for SQL interviews and database queries |
| Python Data Processing | O(n log n) | O(n) | Useful when exporting data and analyzing outside the database |
| SQL Consecutive Login Join | O(n) | O(n) | When detecting consecutive dates directly with joins |
| Two-Pass Hash Map with Date Checks | O(n) | O(n) | Best when implementing logic in C++ or Java without SQL |