Watch 4 video solutions for Grand Slam Titles, a medium-level problem involving Database. This walkthrough by Everyday Data Science has 7,021 views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Players
+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| player_id      | int     |
| player_name    | varchar |
+----------------+---------+
player_id is the primary key (column with unique values) for this table.
Each row in this table contains the name and the ID of a tennis player.
Table: Championships
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| year          | int     |
| Wimbledon     | int     |
| Fr_open       | int     |
| US_open       | int     |
| Au_open       | int     |
+---------------+---------+
year is the primary key (column with unique values) for this table.
Each row of this table contains the IDs of the players who won each tennis tournament of the grand slam in that year.
Write a solution to report the number of grand slam tournaments won by each player. Do not include the players who did not win any tournament.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Players table:
+-----------+-------------+
| player_id | player_name |
+-----------+-------------+
| 1         | Nadal       |
| 2         | Federer     |
| 3         | Novak       |
+-----------+-------------+
Championships table:
+------+-----------+---------+---------+---------+
| year | Wimbledon | Fr_open | US_open | Au_open |
+------+-----------+---------+---------+---------+
| 2018 | 1         | 1       | 1       | 1       |
| 2019 | 1         | 1       | 2       | 2       |
| 2020 | 2         | 1       | 2       | 2       |
+------+-----------+---------+---------+---------+
Output:
+-----------+-------------+-------------------+
| player_id | player_name | grand_slams_count |
+-----------+-------------+-------------------+
| 2         | Federer     | 5                 |
| 1         | Nadal       | 7                 |
+-----------+-------------+-------------------+
Explanation:
Player 1 (Nadal) won 7 titles: Wimbledon (2018, 2019), Fr_open (2018, 2019, 2020), US_open (2018), and Au_open (2018).
Player 2 (Federer) won 5 titles: Wimbledon (2020), US_open (2019, 2020), and Au_open (2019, 2020).
Player 3 (Novak) did not win anything, so we did not include them in the result table.
Problem Overview: The database stores tennis tournament winners for four Grand Slam events (Wimbledon, French Open, US Open, and Australian Open) in separate columns for each year. The task is to compute how many Grand Slam titles each player has won and return their player_id, player_name, and total titles.
Approach 1: Union All + Equi-Join + Group By (O(n) time, O(n) space)
The Championships table stores winners across four different columns, so the first step is converting those columns into a single column of player IDs. Use UNION ALL to vertically combine the four tournament columns into one result set containing all winners. This creates a normalized stream of champion records where each row represents one title. After that, perform an equi-join with the Players table to map each player_id to the corresponding player_name. Because this is an inner join and only title winners appear in the combined set, players without a title drop out automatically. Finally, apply GROUP BY player_id (plus player_name, which stricter SQL dialects require when it appears in the SELECT list) and count the rows in each group to get the total Grand Slam titles. Each of the four SELECT branches reads the Championships table once, so the combined set has 4n rows and the overall work stays O(n), where n is the number of rows in the Championships table.
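The steps above can be sketched as a runnable example. This is a minimal illustration, not the reference solution: it assumes an in-memory SQLite database driven from Python's `sqlite3` module, loads the Example 1 data, and names the flattened set `winners` via a CTE (both the CTE and the `ORDER BY` are choices made here for readability and a stable output, not requirements of the problem).

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the judge's database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players (player_id INTEGER PRIMARY KEY, player_name TEXT);
    CREATE TABLE Championships (
        year INTEGER PRIMARY KEY,
        Wimbledon INTEGER, Fr_open INTEGER, US_open INTEGER, Au_open INTEGER
    );
    INSERT INTO Players VALUES (1, 'Nadal'), (2, 'Federer'), (3, 'Novak');
    INSERT INTO Championships VALUES
        (2018, 1, 1, 1, 1),
        (2019, 1, 1, 2, 2),
        (2020, 2, 1, 2, 2);
""")

QUERY = """
WITH winners AS (
    -- Flatten the four tournament columns: one row per title won.
    SELECT Wimbledon AS player_id FROM Championships
    UNION ALL
    SELECT Fr_open FROM Championships
    UNION ALL
    SELECT US_open FROM Championships
    UNION ALL
    SELECT Au_open FROM Championships
)
SELECT p.player_id, p.player_name, COUNT(*) AS grand_slams_count
FROM Players AS p
JOIN winners AS w ON w.player_id = p.player_id  -- equi-join on the ID
GROUP BY p.player_id, p.player_name
ORDER BY p.player_id;  -- only for a deterministic printout
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 'Nadal', 7), (2, 'Federer', 5)]
```

Novak never appears in `winners`, so the inner join leaves him out without any explicit filter.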
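This approach works well because SQL aggregation is designed for exactly this kind of counting task. UNION ALL is required here for correctness, not only for speed: plain UNION removes duplicate rows, which would collapse a player's repeated wins into a single row and produce wrong counts. UNION ALL also skips the duplicate-elimination step that UNION performs. The combination of vertical flattening and aggregation is a common pattern in SQL interview questions involving denormalized schemas.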
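To see how much the choice between UNION ALL and UNION matters on the Example 1 data, the following sketch counts the rows each operator produces. It assumes SQLite through Python's `sqlite3` module; the helper `count_rows` and the `BRANCHES` template are hypothetical names introduced only for this illustration.

```python
import sqlite3

# Assumption: in-memory SQLite with the Championships rows from Example 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Championships (
        year INTEGER PRIMARY KEY,
        Wimbledon INTEGER, Fr_open INTEGER, US_open INTEGER, Au_open INTEGER
    );
    INSERT INTO Championships VALUES
        (2018, 1, 1, 1, 1),
        (2019, 1, 1, 2, 2),
        (2020, 2, 1, 2, 2);
""")

# The four branches, with the set operator left as a placeholder.
BRANCHES = """
    SELECT Wimbledon AS player_id FROM Championships
    {op}
    SELECT Fr_open FROM Championships
    {op}
    SELECT US_open FROM Championships
    {op}
    SELECT Au_open FROM Championships
"""

def count_rows(op):
    """Return how many rows the flattened set has for the given operator."""
    sql = "SELECT COUNT(*) FROM (" + BRANCHES.format(op=op) + ") AS w"
    return conn.execute(sql).fetchone()[0]

all_rows = count_rows("UNION ALL")   # keeps every title: 3 years x 4 events
distinct_rows = count_rows("UNION")  # keeps only distinct player IDs
print(all_rows, distinct_rows)       # 12 2
```

With UNION the flattened set shrinks to one row per distinct winner, so every player would be reported with a count of 1.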
Approach 2: Derived Table Aggregation (O(n) time, O(n) space)
Another way to structure the same logic is by building a derived table (subquery) that lists all winners first, then aggregating in an outer query. The inner query runs one SELECT per tournament column (four in total) and combines them with UNION ALL. This derived table effectively represents every title ever awarded as individual rows. The outer query then joins that result with the Players table and uses GROUP BY to compute the count per player.
The benefit of this structure is clarity. Separating the normalization step from the aggregation step makes the query easier to reason about during interviews or debugging. Query planners generally evaluate a derived table like this without extra passes over the data, so the complexity remains O(n) time and O(n) intermediate space.
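A sketch of the derived-table form, again assuming an in-memory SQLite database with the Example 1 data. The alias `w` for the derived table and the final dictionary `titles` are illustrative names, not part of the problem statement.

```python
import sqlite3

# Assumption: in-memory SQLite loaded with the Example 1 data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players (player_id INTEGER PRIMARY KEY, player_name TEXT);
    CREATE TABLE Championships (
        year INTEGER PRIMARY KEY,
        Wimbledon INTEGER, Fr_open INTEGER, US_open INTEGER, Au_open INTEGER
    );
    INSERT INTO Players VALUES (1, 'Nadal'), (2, 'Federer'), (3, 'Novak');
    INSERT INTO Championships VALUES
        (2018, 1, 1, 1, 1),
        (2019, 1, 1, 2, 2),
        (2020, 2, 1, 2, 2);
""")

QUERY = """
SELECT p.player_id, p.player_name, COUNT(*) AS grand_slams_count
FROM (
    -- Inner query: normalization step, one row per title.
    SELECT Wimbledon AS player_id FROM Championships
    UNION ALL
    SELECT Fr_open FROM Championships
    UNION ALL
    SELECT US_open FROM Championships
    UNION ALL
    SELECT Au_open FROM Championships
) AS w
JOIN Players AS p ON p.player_id = w.player_id
-- Outer query: aggregation step, one row per winning player.
GROUP BY p.player_id, p.player_name;
"""

# Collect the result as {player_name: grand_slams_count}; row order is irrelevant.
titles = {name: count for _, name, count in conn.execute(QUERY)}
print(titles)
```

The inner SELECT can be run on its own to inspect the flattened rows before any counting happens, which is the debugging benefit described above.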
Both approaches rely heavily on database fundamentals such as joins and aggregations. Understanding how to reshape data with UNION ALL and summarize results using GROUP BY is the key insight.
Recommended for interviews: The Union All + Join + Group By approach is the expected solution. Interviewers want to see that you recognize the schema is denormalized and know how to transform multiple columns into rows before aggregating. Writing the union clearly and grouping correctly demonstrates solid SQL fundamentals.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Union All + Equi-Join + Group By | O(n) | O(n) | Best general solution when multiple columns represent similar entities and must be normalized before aggregation |
| Derived Table Aggregation | O(n) | O(n) | Useful when you want clearer query structure by separating normalization and aggregation steps |