Table: DailySales
+-------------+---------+ | Column Name | Type | +-------------+---------+ | date_id | date | | make_name | varchar | | lead_id | int | | partner_id | int | +-------------+---------+ There is no primary key (column with unique values) for this table. It may contain duplicates. This table contains the date and the name of the product sold and the IDs of the lead and partner it was sold to. The name consists of only lowercase English letters.
For each date_id and make_name, find the number of distinct lead_id's and distinct partner_id's.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: DailySales table: +-----------+-----------+---------+------------+ | date_id | make_name | lead_id | partner_id | +-----------+-----------+---------+------------+ | 2020-12-8 | toyota | 0 | 1 | | 2020-12-8 | toyota | 1 | 0 | | 2020-12-8 | toyota | 1 | 2 | | 2020-12-7 | toyota | 0 | 2 | | 2020-12-7 | toyota | 0 | 1 | | 2020-12-8 | honda | 1 | 2 | | 2020-12-8 | honda | 2 | 1 | | 2020-12-7 | honda | 0 | 1 | | 2020-12-7 | honda | 1 | 2 | | 2020-12-7 | honda | 2 | 1 | +-----------+-----------+---------+------------+ Output: +-----------+-----------+--------------+-----------------+ | date_id | make_name | unique_leads | unique_partners | +-----------+-----------+--------------+-----------------+ | 2020-12-8 | toyota | 2 | 3 | | 2020-12-7 | toyota | 1 | 2 | | 2020-12-8 | honda | 2 | 2 | | 2020-12-7 | honda | 3 | 2 | +-----------+-----------+--------------+-----------------+ Explanation: For 2020-12-8, toyota gets leads = [0, 1] and partners = [0, 1, 2] while honda gets leads = [1, 2] and partners = [1, 2]. For 2020-12-7, toyota gets leads = [0] and partners = [1, 2] while honda gets leads = [0, 1, 2] and partners = [1, 2].
Problem Overview: You receive a sales log where each row contains date_id, make_name, lead_id, and partner_id. The task is to aggregate this data so that for every unique pair of date_id and make_name, you return the number of distinct leads and distinct partners.
Approach 1: Group By with Set Aggregation (O(n) time, O(n) space)
This approach simulates SQL aggregation using in-memory data structures. Iterate through every record and group rows by the key (date_id, make_name) using a hash map. For each group, maintain two sets: one for lead_id values and another for partner_id. Inserting into a set automatically removes duplicates, so the final counts are simply the sizes of these sets. This approach works well when solving the problem in languages like Python or JavaScript outside a database environment. It relies on hash table lookups and set operations to keep the solution linear.
Approach 2: SQL Group By and Count Distinct (O(n) time, O(1) extra space)
Databases already provide optimized aggregation primitives, making SQL the cleanest solution. Group the rows by date_id and make_name, then compute the number of unique identifiers using COUNT(DISTINCT lead_id) and COUNT(DISTINCT partner_id). The database engine performs grouping and deduplication internally using optimized indexing and aggregation algorithms. This approach is standard for SQL and database interview questions because it expresses the entire solution in a single query.
Recommended for interviews: The SQL GROUP BY with COUNT(DISTINCT) is the expected solution since the problem is fundamentally a database aggregation task. Demonstrating the set-based grouping approach in a general-purpose language shows that you understand how SQL aggregation works internally, but the SQL query is what interviewers typically look for.
In this approach, the goal is to group the data by both `date_id` and `make_name`. While iterating over the data, for each group, maintain two sets, one for distinct `lead_id`s and one for distinct `partner_id`s. Ensure to keep track of only unique IDs by using sets. Once all data in a group has been processed, the size of each set gives the number of unique IDs.
In the Python solution, a dictionary with nested dictionaries is used to group data by `date_id` and `make_name`. Each combination keeps two sets for distinct `lead_id` and `partner_id`. The final counts are added to the result.
Python
JavaScript
The time complexity is O(n), where n is the number of entries, as we process each entry once. The space complexity is O(u), where u is the number of unique (date_id, make_name) combinations times the distinct IDs per combination.
SQL can efficiently handle this task using aggregation functions. The plan here is to use the GROUP BY clause with COUNT(DISTINCT ...) to find distinct lead and partner counts for each combination of `date_id` and `make_name`.
The SQL approach utilizes the GROUP BY clause to aggregate results based on `date_id` and `make_name`. The COUNT(DISTINCT ...) function is used twice to determine the number of unique `lead_id`s and `partner_id`s within each group. The result is a table with the necessary counts per unique date and make.
SQL
The time complexity depends on the database indices and can often be O(n log n) due to sorting and deduplication. The space complexity is determined by the size of the intermediate tables, typically O(g) where g is the size of the distinct groups formed.
We can use the GROUP BY statement to group the data by the date_id and make_name fields, and then use the COUNT(DISTINCT) function to count the number of distinct values for lead_id and partner_id.
MySQL
| Approach | Complexity |
|---|---|
| Approach 1: Group By with Set Aggregation | The time complexity is O(n), where n is the number of entries, as we process each entry once. The space complexity is O(u), where u is the number of unique (date_id, make_name) combinations times the distinct IDs per combination. |
| Approach 2: SQL Group By and Count Distinct | The time complexity depends on the database indices and can often be O(n log n) due to sorting and deduplication. The space complexity is determined by the size of the intermediate tables, typically O(g) where g is the size of the distinct groups formed. |
| Group By + Count Distinct | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Group By with Set Aggregation | O(n) | O(n) | When solving outside SQL using Python or JavaScript and simulating aggregation with hash maps and sets |
| SQL GROUP BY with COUNT DISTINCT | O(n) | O(1) extra | Best choice for database queries and SQL interviews where aggregation is handled by the database engine |
LeetCode 1693 Interview SQL Question with Detailed Explanation | Practice SQL • Everyday Data Science • 5,054 views views
Watch 9 more video solutions →Practice Daily Leads and Partners with our built-in code editor and test cases.
Practice on FleetCodePractice this problem
Open in Editor