Table: Steps
+-------------+------+ | Column Name | Type | +-------------+------+ | user_id | int | | steps_count | int | | steps_date | date | +-------------+------+ (user_id, steps_date) is the primary key for this table. Each row of this table contains user_id, steps_count, and steps_date.
Write a solution to calculate 3-day rolling averages of steps for each user.
We calculate the n-day rolling average this way:
n consecutive days of step counts ending on that day if available, otherwise, n-day rolling average is not defined for it.Output the user_id, steps_date, and rolling average. Round the rolling average to two decimal places.
Return the result table ordered by user_id, steps_date in ascending order.
The result format is in the following example.
Example 1:
Input: Steps table: +---------+-------------+------------+ | user_id | steps_count | steps_date | +---------+-------------+------------+ | 1 | 687 | 2021-09-02 | | 1 | 395 | 2021-09-04 | | 1 | 499 | 2021-09-05 | | 1 | 712 | 2021-09-06 | | 1 | 576 | 2021-09-07 | | 2 | 153 | 2021-09-06 | | 2 | 171 | 2021-09-07 | | 2 | 530 | 2021-09-08 | | 3 | 945 | 2021-09-04 | | 3 | 120 | 2021-09-07 | | 3 | 557 | 2021-09-08 | | 3 | 840 | 2021-09-09 | | 3 | 627 | 2021-09-10 | | 5 | 382 | 2021-09-05 | | 6 | 480 | 2021-09-01 | | 6 | 191 | 2021-09-02 | | 6 | 303 | 2021-09-05 | +---------+-------------+------------+ Output: +---------+------------+-----------------+ | user_id | steps_date | rolling_average | +---------+------------+-----------------+ | 1 | 2021-09-06 | 535.33 | | 1 | 2021-09-07 | 595.67 | | 2 | 2021-09-08 | 284.67 | | 3 | 2021-09-09 | 505.67 | | 3 | 2021-09-10 | 674.67 | +---------+------------+-----------------+ Explanation: - For user id 1, the step counts for the three consecutive days up to 2021-09-06 are available. Consequently, the rolling average for this particular date is computed as (395 + 499 + 712) / 3 = 535.33. - For user id 1, the step counts for the three consecutive days up to 2021-09-07 are available. Consequently, the rolling average for this particular date is computed as (499 + 712 + 576) / 3 = 595.67. - For user id 2, the step counts for the three consecutive days up to 2021-09-08 are available. Consequently, the rolling average for this particular date is computed as (153 + 171 + 530) / 3 = 284.67. - For user id 3, the step counts for the three consecutive days up to 2021-09-09 are available. Consequently, the rolling average for this particular date is computed as (120 + 557 + 840) / 3 = 505.67. - For user id 3, the step counts for the three consecutive days up to 2021-09-10 are available. Consequently, the rolling average for this particular date is computed as (557 + 840 + 627) / 3 = 674.67. - For user id 4 and 5, the calculation of the rolling average is not viable as there is insufficient data for the consecutive three days. Output table ordered by user_id and steps_date in ascending order.
Problem Overview: The task asks you to compute a rolling average of steps over a fixed number of previous days for each user. Each row represents a daily step count, and you must calculate the average that includes the current day and a defined number of preceding days.
Approach 1: Self Join / Aggregation (O(n²) time, O(n) space)
A straightforward method joins the table with itself to gather rows from previous days for the same user. For every row, filter records where the date falls within the rolling window (for example, the current day and the previous two days). Then apply AVG() over those rows. This approach works in any SQL dialect without advanced analytics features, but it performs poorly on large datasets because every row scans multiple other rows from the same table. The repeated join operations lead to quadratic growth as the dataset grows.
Approach 2: Window Functions (O(n log n) time, O(1) extra space)
The optimal solution uses SQL window functions. First partition the data by user_id so each user's records are processed independently. Then order each partition by the step date. A window frame such as ROWS BETWEEN 2 PRECEDING AND CURRENT ROW defines the rolling range. Applying AVG(steps) over this window automatically calculates the rolling average without additional joins. The database internally sorts rows within each partition, which is why the effective complexity is around O(n log n). This approach is concise, efficient, and scales well with large datasets.
Window functions are widely used in analytics queries where calculations depend on nearby rows. Problems involving running totals, ranking, or sliding averages typically rely on the same technique. Understanding how to use PARTITION BY, ORDER BY, and window frames is essential when solving SQL analytics problems.
Recommended for interviews: Interviewers expect the window function approach. The self-join version demonstrates that you understand how to derive the rolling range manually, but the optimized solution shows familiarity with modern SQL analytics features. Practice similar patterns in database query problems and sliding calculations using window functions, since they frequently appear in data engineering and analytics interviews.
We can use the window function LAG() OVER() to calculate the difference in days between the current date and the date before the last date for each user. If the difference is 2, it means that there are continuous data for 3 days between these two dates. We can use the window function AVG() OVER() to calculate the average of these 3 data.
MySQL
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Self Join with Aggregation | O(n²) | O(n) | When window functions are unavailable or when demonstrating the basic logic of rolling ranges. |
| Window Functions (PARTITION + ORDER + Frame) | O(n log n) | O(1) extra | Best for analytical SQL queries where calculations depend on neighboring rows. Scales well for large datasets. |
Leetcode MEDIUM 2854 - Rolling Average Steps - Using ROWS BETWEEN in SQL | Everyday Data Science • Everyday Data Science • 781 views views
Watch 1 more video solutions →Practice Rolling Average Steps with our built-in code editor and test cases.
Practice on FleetCodePractice this problem
Open in Editor