Hopper Company Queries II - Solution & Explanation

Hard · Premium · Free on FleetCode · Database · 7 min read · Asked at: Uber

Problem Statement

Table: Drivers

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| driver_id   | int     |
| join_date   | date    |
+-------------+---------+
driver_id is the column with unique values for this table.
Each row of this table contains the driver's ID and the date they joined the Hopper company.

Table: Rides

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| ride_id      | int     |
| user_id      | int     |
| requested_at | date    |
+--------------+---------+
ride_id is the column with unique values for this table.
Each row of this table contains the ID of a ride, the user's ID that requested it, and the day they requested it.
There may be some ride requests in this table that were not accepted.

Table: AcceptedRides

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| ride_id       | int     |
| driver_id     | int     |
| ride_distance | int     |
| ride_duration | int     |
+---------------+---------+
ride_id is the column with unique values for this table.
Each row of this table contains some information about an accepted ride.
It is guaranteed that each accepted ride exists in the Rides table.

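Write a solution to report the percentage of working drivers (working_percentage) for each month of 2020 where:

working_percentage = (number of drivers that accepted at least one ride during the month) / (number of available drivers during the month) * 100.0
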
Note that if the number of available drivers during a month is zero, we consider the working_percentage to be 0.

Return the result table ordered by month in ascending order, where month is the month's number (January is 1, February is 2, etc.). Round working_percentage to the nearest 2 decimal places.

The result format is in the following example.

Example 1:

Input: 
Drivers table:
+-----------+------------+
| driver_id | join_date  |
+-----------+------------+
| 10        | 2019-12-10 |
| 8         | 2020-1-13  |
| 5         | 2020-2-16  |
| 7         | 2020-3-8   |
| 4         | 2020-5-17  |
| 1         | 2020-10-24 |
| 6         | 2021-1-5   |
+-----------+------------+
Rides table:
+---------+---------+--------------+
| ride_id | user_id | requested_at |
+---------+---------+--------------+
| 6       | 75      | 2019-12-9    |
| 1       | 54      | 2020-2-9     |
| 10      | 63      | 2020-3-4     |
| 19      | 39      | 2020-4-6     |
| 3       | 41      | 2020-6-3     |
| 13      | 52      | 2020-6-22    |
| 7       | 69      | 2020-7-16    |
| 17      | 70      | 2020-8-25    |
| 20      | 81      | 2020-11-2    |
| 5       | 57      | 2020-11-9    |
| 2       | 42      | 2020-12-9    |
| 11      | 68      | 2021-1-11    |
| 15      | 32      | 2021-1-17    |
| 12      | 11      | 2021-1-19    |
| 14      | 18      | 2021-1-27    |
+---------+---------+--------------+
AcceptedRides table:
+---------+-----------+---------------+---------------+
| ride_id | driver_id | ride_distance | ride_duration |
+---------+-----------+---------------+---------------+
| 10      | 10        | 63            | 38            |
| 13      | 10        | 73            | 96            |
| 7       | 8         | 100           | 28            |
| 17      | 7         | 119           | 68            |
| 20      | 1         | 121           | 92            |
| 5       | 7         | 42            | 101           |
| 2       | 4         | 6             | 38            |
| 11      | 8         | 37            | 43            |
| 15      | 8         | 108           | 82            |
| 12      | 8         | 38            | 34            |
| 14      | 1         | 90            | 74            |
+---------+-----------+---------------+---------------+
Output: 
+-------+--------------------+
| month | working_percentage |
+-------+--------------------+
| 1     | 0.00               |
| 2     | 0.00               |
| 3     | 25.00              |
| 4     | 0.00               |
| 5     | 0.00               |
| 6     | 20.00              |
| 7     | 20.00              |
| 8     | 20.00              |
| 9     | 0.00               |
| 10    | 0.00               |
| 11    | 33.33              |
| 12    | 16.67              |
+-------+--------------------+
Explanation: 
By the end of January --> two active drivers (10, 8) and no accepted rides. The percentage is 0%.
By the end of February --> three active drivers (10, 8, 5) and no accepted rides. The percentage is 0%.
By the end of March --> four active drivers (10, 8, 5, 7) and one accepted ride by driver (10). The percentage is (1 / 4) * 100 = 25%.
By the end of April --> four active drivers (10, 8, 5, 7) and no accepted rides. The percentage is 0%.
By the end of May --> five active drivers (10, 8, 5, 7, 4) and no accepted rides. The percentage is 0%.
By the end of June --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (10). The percentage is (1 / 5) * 100 = 20%.
By the end of July --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (8). The percentage is (1 / 5) * 100 = 20%.
By the end of August --> five active drivers (10, 8, 5, 7, 4) and one accepted ride by driver (7). The percentage is (1 / 5) * 100 = 20%.
By the end of September --> five active drivers (10, 8, 5, 7, 4) and no accepted rides. The percentage is 0%.
By the end of October --> six active drivers (10, 8, 5, 7, 4, 1) and no accepted rides. The percentage is 0%.
By the end of November --> six active drivers (10, 8, 5, 7, 4, 1) and two accepted rides by two different drivers (1, 7). The percentage is (2 / 6) * 100 = 33.33%.
By the end of December --> six active drivers (10, 8, 5, 7, 4, 1) and one accepted ride by driver (4). The percentage is (1 / 6) * 100 = 16.67%.

Approach Overview

Problem Overview: You need to generate a monthly utilization report for Hopper. For every month in 2020, compute the percentage of available drivers who accepted at least one ride that month. A driver counts as available from the month they joined onward, so the denominator is cumulative. The catch is that all twelve months must appear in the output, and months with no accepted rides or no available drivers are reported as 0.

Approach 1: Aggregation with Calendar Table + LEFT JOIN (O(n) time, O(1) extra space)

The core idea is to compute two per-month counts and join both to a generated list of months from 1 to 12. For the numerator, join the Rides and AcceptedRides tables so that only accepted rides are considered, keep the rides whose requested_at falls in 2020, and take COUNT(DISTINCT driver_id) using GROUP BY MONTH(...). For the denominator, count the drivers whose join_date is on or before the last day of each month; drivers who joined before 2020 are included in every month.

Next create a month sequence (1–12) using a recursive CTE or a small derived table. Perform a LEFT JOIN from the month list to both counts. This guarantees that months without accepted rides remain in the result. Divide working drivers by available drivers, multiply by 100, and use IFNULL or COALESCE to return 0 when the numerator is missing or the denominator is zero. Finally round the percentage to two decimal places and order by month.
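The month sequence mentioned above can be sketched with a recursive CTE. This is a minimal example assuming MySQL 8.0 or later (earlier versions do not support WITH RECURSIVE); the CTE name months and its column name month are illustrative choices.

```sql
-- Generate the numbers 1..12, one row per calendar month.
WITH RECURSIVE months (month) AS (
    SELECT 1                      -- anchor row: January
    UNION ALL
    SELECT month + 1              -- recursive step: next month
    FROM months
    WHERE month < 12              -- stop after December
)
SELECT month
FROM months
ORDER BY month;
```

On a server without recursive CTE support, a derived table built from twelve SELECT ... UNION ALL branches serves the same purpose.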

The important insight is separating the problem into two parts: computing metrics from real ride data, and generating a complete calendar dimension to fill missing months. SQL reporting problems often follow this pattern. The join ensures that sparse transactional data still produces a continuous time series.
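For this problem, the available-driver count is the part that depends most directly on the calendar dimension. The sketch below is one possible way to compute it, assuming MySQL 8.0+; the names months and available_drivers are illustrative. A driver is counted for a month when join_date is earlier than the first day of the following month.

```sql
WITH RECURSIVE months (month) AS (
    SELECT 1
    UNION ALL
    SELECT month + 1 FROM months WHERE month < 12
)
SELECT m.month,
       -- COUNT(column) ignores the NULLs produced by the LEFT JOIN,
       -- so a month with no drivers yet yields 0 rather than 1.
       COUNT(d.driver_id) AS available_drivers
FROM months AS m
LEFT JOIN Drivers AS d
       -- '2020-01-01' + m.month months is the first day of the next month
       ON d.join_date < DATE_ADD('2020-01-01', INTERVAL m.month MONTH)
GROUP BY m.month
ORDER BY m.month;
```

Against the example input this gives 2 available drivers for January (drivers 10 and 8), which agrees with the explanation in the problem statement.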

This approach relies heavily on SQL aggregation, date extraction, and outer joins. These patterns frequently appear in database interview questions and analytics workloads. Building a synthetic month table and merging it with aggregated metrics is a common requirement in reporting and data analysis queries.

Recommended for interviews: The aggregation + calendar join approach is the expected solution. A naive query that only groups existing ride records misses months with zero rides, which fails the requirement. Showing the ability to construct a month sequence and combine it with aggregated results demonstrates strong SQL reporting skills.

Solution

Code

MySQL
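The page does not include the query itself, so the following is a sketch of the calendar-join approach rather than a reference solution. It assumes MySQL 8.0+ for WITH RECURSIVE; the CTE names (months, available, working) and the intermediate column names are illustrative.

```sql
WITH RECURSIVE months (month) AS (
    -- Calendar dimension: months 1..12
    SELECT 1
    UNION ALL
    SELECT month + 1 FROM months WHERE month < 12
),
available AS (
    -- Drivers who joined on or before the last day of each 2020 month
    SELECT m.month, COUNT(d.driver_id) AS available_drivers
    FROM months AS m
    LEFT JOIN Drivers AS d
           ON d.join_date < DATE_ADD('2020-01-01', INTERVAL m.month MONTH)
    GROUP BY m.month
),
working AS (
    -- Distinct drivers with at least one accepted ride requested in each 2020 month
    SELECT MONTH(r.requested_at) AS month,
           COUNT(DISTINCT a.driver_id) AS working_drivers
    FROM AcceptedRides AS a
    JOIN Rides AS r ON r.ride_id = a.ride_id
    WHERE r.requested_at >= '2020-01-01'
      AND r.requested_at <  '2021-01-01'
    GROUP BY MONTH(r.requested_at)
)
SELECT av.month,
       -- NULLIF guards the zero-driver case; IFNULL turns a missing
       -- numerator or a NULL quotient into 0.
       IFNULL(ROUND(100 * w.working_drivers / NULLIF(av.available_drivers, 0), 2), 0)
           AS working_percentage
FROM available AS av
LEFT JOIN working AS w ON w.month = av.month
ORDER BY av.month;
```

Filtering requested_at with a half-open date range instead of YEAR(requested_at) = 2020 is a design choice that lets an index on requested_at be used, if one exists.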


Detailed Complexity Analysis

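+--------------------------------------------------+------+-------+---------------------------------------------------------------+
| Approach                                         | Time | Space | When to Use                                                   |
+--------------------------------------------------+------+-------+---------------------------------------------------------------+
| Direct GROUP BY on ride tables                   | O(n) | O(1)  | When you only need months that actually contain rides         |
| Calendar table + LEFT JOIN aggregation (optimal) | O(n) | O(1)  | When missing months must appear with zero values              |
| Recursive CTE month generation + aggregation     | O(n) | O(12) | When a physical calendar table does not exist in the database |
+--------------------------------------------------+------+-------+---------------------------------------------------------------+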

Video Solution

Leetcode HARD 1645 - RECURSIVE CTE SQL Explained - Hopper Company Queries 2 | Everyday Data Science • Everyday Data Science • 675 views


Frequently Asked Questions

Is Hopper Company Queries II easy or hard?

Hopper Company Queries II is labeled Hard because it requires combining multiple SQL concepts: joining datasets, aggregating metrics, handling time-based grouping, and ensuring missing months appear in the final output.

Hopper Company Queries II Python/Java solution

This problem is designed for SQL rather than general-purpose languages. The solution is implemented as a MySQL query using joins, aggregation functions like COUNT(DISTINCT ...), and a derived or recursive table to generate months.

How to solve Hopper Company Queries II in O(n)?

Join the Rides and AcceptedRides tables to keep the accepted rides requested in 2020, extract the month from requested_at, and count distinct driver_id values per month using GROUP BY. Separately count the drivers whose join_date falls on or before the end of each month. Generate a list of months 1–12, LEFT JOIN both counts to it, and compute the percentage, replacing NULL results with 0.

What is the best approach for Hopper Company Queries II?

The most reliable solution uses SQL aggregation combined with a generated calendar table. First compute, for each month, the number of distinct drivers with an accepted ride and the cumulative number of drivers who have joined, then LEFT JOIN both counts to a list of months 1–12 and divide. This ensures months without accepted rides still appear with a value of 0.

Is Hopper Company Queries II asked at Google/Amazon/Meta?

SQL aggregation and reporting questions similar to Hopper Company Queries II frequently appear in data engineering and analytics interviews at companies like Google, Amazon, and Meta. They test joins, grouping, and handling missing time-series data.

What data structure is used in Hopper Company Queries II?

The problem relies on relational database tables and SQL operations rather than traditional data structures. Key concepts include joins between tables, GROUP BY aggregation, and a generated calendar table for time-series completeness.

What is the time complexity of Hopper Company Queries II?

The query reads the ride and driver data a constant number of times to compute the aggregates, so the time complexity is O(n) where n is the total number of ride and driver records. The month table contains only 12 rows, so joining against it adds only a constant factor.


Related Problems

Hopper Company Queries I

Hopper Company Queries III

Problem Info

Difficulty: Hard

Acceptance: 39.1%

Approaches: 1

Reading time: 7 min

Asked at: Uber

