Table: Drivers
+--------------+---------+ | Column Name | Type | +--------------+---------+ | driver_id | int | | name | varchar | | age | int | | experience | int | | accidents | int | +--------------+---------+ (driver_id) is the unique key for this table. Each row includes a driver's ID, their name, age, years of driving experience, and the number of accidents they’ve had.
Table: Vehicles
+--------------+---------+ | vehicle_id | int | | driver_id | int | | model | varchar | | fuel_type | varchar | | mileage | int | +--------------+---------+ (vehicle_id, driver_id, fuel_type) is the unique key for this table. Each row includes the vehicle's ID, the driver who operates it, the model, fuel type, and mileage.
Table: Trips
+--------------+---------+ | trip_id | int | | vehicle_id | int | | distance | int | | duration | int | | rating | int | +--------------+---------+ (trip_id) is the unique key for this table. Each row includes a trip's ID, the vehicle used, the distance covered (in miles), the trip duration (in minutes), and the passenger's rating (1-5).
Uber is analyzing drivers based on their trips. Write a solution to find the top-performing driver for each fuel type based on the following criteria:
2 decimal places.Return the result table ordered by fuel_type in ascending order.
The result format is in the following example.
Example:
Input:
Drivers table:
+-----------+----------+-----+------------+-----------+ | driver_id | name | age | experience | accidents | +-----------+----------+-----+------------+-----------+ | 1 | Alice | 34 | 10 | 1 | | 2 | Bob | 45 | 20 | 3 | | 3 | Charlie | 28 | 5 | 0 | +-----------+----------+-----+------------+-----------+
Vehicles table:
+------------+-----------+---------+-----------+---------+ | vehicle_id | driver_id | model | fuel_type | mileage | +------------+-----------+---------+-----------+---------+ | 100 | 1 | Sedan | Gasoline | 20000 | | 101 | 2 | SUV | Electric | 30000 | | 102 | 3 | Coupe | Gasoline | 15000 | +------------+-----------+---------+-----------+---------+
Trips table:
+---------+------------+----------+----------+--------+ | trip_id | vehicle_id | distance | duration | rating | +---------+------------+----------+----------+--------+ | 201 | 100 | 50 | 30 | 5 | | 202 | 100 | 30 | 20 | 4 | | 203 | 101 | 100 | 60 | 4 | | 204 | 101 | 80 | 50 | 5 | | 205 | 102 | 40 | 30 | 5 | | 206 | 102 | 60 | 40 | 5 | +---------+------------+----------+----------+--------+
Output:
+-----------+-----------+--------+----------+ | fuel_type | driver_id | rating | distance | +-----------+-----------+--------+----------+ | Electric | 2 | 4.50 | 180 | | Gasoline | 3 | 5.00 | 100 | +-----------+-----------+--------+----------+
Explanation:
Gasoline, both Alice (Driver 1) and Charlie (Driver 3) have trips. Charlie has an average rating of 5.0, while Alice has 4.5. Therefore, Charlie is selected.Electric, Bob (Driver 2) is the only driver with an average rating of 4.5, so he is selected.The output table is ordered by fuel_type in ascending order.
Problem Overview: You are given database tables containing ride or trip information and driver records. The task is to determine which driver performs the best based on aggregated ride metrics such as completed trips, earnings, or ratings. The final result typically requires ranking drivers after computing totals across their rides.
Approach 1: Aggregation with ORDER BY and LIMIT (O(n log n) time, O(1) extra space)
The simplest approach aggregates driver performance metrics using GROUP BY. You join the relevant tables using an equi-join (for example, driver ID linking trips to drivers), then compute totals such as COUNT(), SUM(), or AVG(). After computing these aggregated values, sort the results using ORDER BY based on the performance metric and return the top row with LIMIT 1. The database performs sorting internally, which usually costs O(n log n) for n grouped drivers. This works well when you only need the single best driver.
This method is straightforward and readable. However, it becomes less flexible if the problem requires ranking multiple drivers or handling ties explicitly.
Approach 2: Equi-Join + Grouping + Window Function (O(n log n) time, O(n) space)
A more scalable solution calculates driver metrics first and then ranks them using a window function. Start by joining the necessary tables using an equi-join on the driver identifier. Use GROUP BY to aggregate performance metrics for each driver. This step produces a derived dataset where each row represents a driver and their computed statistics.
Next, apply a window function such as RANK() or DENSE_RANK() with OVER (ORDER BY metric DESC). The window function assigns a ranking without collapsing rows further. You can then filter rows where the rank equals 1 to retrieve the top-performing driver. Because window functions require ordered processing, the database performs sorting internally, leading to roughly O(n log n) time complexity for n aggregated drivers.
This approach is common in production analytics queries. Window functions allow flexible ranking, easy tie handling, and extension to top-k queries without rewriting the aggregation logic. It also keeps the query readable even when multiple ranking metrics are involved.
Key SQL concepts used here include SQL joins to combine tables, GROUP BY aggregation to compute driver-level metrics, and window functions for ranking results.
Recommended for interviews: The equi-join + grouping + window function approach is typically what interviewers expect. Starting with the simple aggregation + sorting solution shows you understand SQL fundamentals. Moving to window functions demonstrates stronger SQL fluency and the ability to write scalable analytical queries.
We can use equi-join to join the Drivers table with the Vehicles table on driver_id, and then join with the Trips table on vehicle_id. Next, we group by fuel_type and driver_id to calculate each driver's average rating, total mileage, and total accident count. Then, using the RANK() window function, we rank the drivers of each fuel type in descending order of rating, descending order of total mileage, and ascending order of total accident count. Finally, we filter out the driver ranked 1 for each fuel type.
MySQL
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Aggregation + ORDER BY + LIMIT | O(n log n) | O(1) | When only the single top driver is needed and the query can rely on sorting |
| Equi-Join + Grouping + Window Function | O(n log n) | O(n) | Best for ranking drivers, handling ties, or retrieving top-k performers |
Leetcode MEDIUM 3308 - Find Top Performing Driver - ROW_NUMBER() in SQL | Everyday Data Science • Everyday Data Science • 603 views views
Practice Find Top Performing Driver with our built-in code editor and test cases.
Practice on FleetCodePractice this problem
Open in Editor