Watch the video solution for Number of Times a Driver Was a Passenger, a medium level problem involving Database. This walkthrough by Everyday Data Science has 2,880 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Rides
+--------------+------+ | Column Name | Type | +--------------+------+ | ride_id | int | | driver_id | int | | passenger_id | int | +--------------+------+ ride_id is the column with unique values for this table. Each row of this table contains the ID of the driver and the ID of the passenger that rode in ride_id. Note that driver_id != passenger_id.
Write a solution to report the ID of each driver and the number of times they were a passenger.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Rides table: +---------+-----------+--------------+ | ride_id | driver_id | passenger_id | +---------+-----------+--------------+ | 1 | 7 | 1 | | 2 | 7 | 2 | | 3 | 11 | 1 | | 4 | 11 | 7 | | 5 | 11 | 7 | | 6 | 11 | 3 | +---------+-----------+--------------+ Output: +-----------+-----+ | driver_id | cnt | +-----------+-----+ | 7 | 2 | | 11 | 0 | +-----------+-----+ Explanation: There are two drivers in all the given rides: 7 and 11. The driver with ID = 7 was a passenger two times. The driver with ID = 11 was never a passenger.
Problem Overview: The Rides table records each trip with a driver_id and passenger_id. Some drivers also take rides as passengers. The task is to list every driver and count how many times that same person appears as a passenger in other rides.
Approach 1: DISTINCT Drivers + LEFT JOIN Aggregation (O(n) time, O(d) space)
Start by extracting the set of all drivers using SELECT DISTINCT driver_id. This ensures the output includes every driver who has ever driven a ride. Then perform a LEFT JOIN with the Rides table where driver_id from the driver set matches passenger_id. Each match represents a ride where that driver acted as a passenger. Use COUNT(r.ride_id) with GROUP BY driver_id to compute the total occurrences. The LEFT JOIN guarantees drivers with zero passenger rides still appear with a count of 0. The query performs a linear scan over the rides table, giving O(n) time, while storing the distinct driver set requires O(d) space.
This pattern is common in database interview questions: build a reference set first, then join against the same table for analysis. The key insight is separating the driver universe from the passenger occurrences. Without that separation, drivers who never appear as passengers would disappear from the result.
Approach 2: Subquery Count per Driver (O(n * d) worst case, O(1) extra space)
Another option uses a correlated subquery. First select distinct drivers, then compute the passenger count using a nested query such as (SELECT COUNT(*) FROM Rides r WHERE r.passenger_id = d.driver_id). This avoids explicit joins but runs the counting query for each driver. If there are many drivers, the database repeatedly scans the rides table. In the worst case this leads to O(n * d) time. Space usage stays minimal because the database evaluates counts on demand.
While shorter to write, correlated subqueries are usually slower than a join with aggregation. Query planners sometimes optimize them, but relying on that is risky in interviews.
The problem mainly tests understanding of SQL grouping and LEFT JOIN semantics. Specifically, recognizing when you must preserve rows from one side of a join to keep zero-count results.
Recommended for interviews: Use the DISTINCT driver_id + LEFT JOIN + GROUP BY approach. It scales well, clearly shows the relationship between drivers and passenger appearances, and demonstrates proper use of aggregation. Mentioning the correlated subquery alternative shows awareness of simpler but less efficient patterns.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| DISTINCT Drivers + LEFT JOIN + GROUP BY | O(n) | O(d) | Best general solution; preserves drivers with zero passenger rides |
| Correlated Subquery Count | O(n × d) worst case | O(1) | Simpler query but less efficient on large datasets |