Watch 10 video solutions for Daily Leads and Partners, a easy level problem involving Database. This walkthrough by Everyday Data Science has 5,054 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: DailySales
+-------------+---------+ | Column Name | Type | +-------------+---------+ | date_id | date | | make_name | varchar | | lead_id | int | | partner_id | int | +-------------+---------+ There is no primary key (column with unique values) for this table. It may contain duplicates. This table contains the date and the name of the product sold and the IDs of the lead and partner it was sold to. The name consists of only lowercase English letters.
For each date_id and make_name, find the number of distinct lead_id's and distinct partner_id's.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: DailySales table: +-----------+-----------+---------+------------+ | date_id | make_name | lead_id | partner_id | +-----------+-----------+---------+------------+ | 2020-12-8 | toyota | 0 | 1 | | 2020-12-8 | toyota | 1 | 0 | | 2020-12-8 | toyota | 1 | 2 | | 2020-12-7 | toyota | 0 | 2 | | 2020-12-7 | toyota | 0 | 1 | | 2020-12-8 | honda | 1 | 2 | | 2020-12-8 | honda | 2 | 1 | | 2020-12-7 | honda | 0 | 1 | | 2020-12-7 | honda | 1 | 2 | | 2020-12-7 | honda | 2 | 1 | +-----------+-----------+---------+------------+ Output: +-----------+-----------+--------------+-----------------+ | date_id | make_name | unique_leads | unique_partners | +-----------+-----------+--------------+-----------------+ | 2020-12-8 | toyota | 2 | 3 | | 2020-12-7 | toyota | 1 | 2 | | 2020-12-8 | honda | 2 | 2 | | 2020-12-7 | honda | 3 | 2 | +-----------+-----------+--------------+-----------------+ Explanation: For 2020-12-8, toyota gets leads = [0, 1] and partners = [0, 1, 2] while honda gets leads = [1, 2] and partners = [1, 2]. For 2020-12-7, toyota gets leads = [0] and partners = [1, 2] while honda gets leads = [0, 1, 2] and partners = [1, 2].
Problem Overview: You receive a sales log where each row contains date_id, make_name, lead_id, and partner_id. The task is to aggregate this data so that for every unique pair of date_id and make_name, you return the number of distinct leads and distinct partners.
Approach 1: Group By with Set Aggregation (O(n) time, O(n) space)
This approach simulates SQL aggregation using in-memory data structures. Iterate through every record and group rows by the key (date_id, make_name) using a hash map. For each group, maintain two sets: one for lead_id values and another for partner_id. Inserting into a set automatically removes duplicates, so the final counts are simply the sizes of these sets. This approach works well when solving the problem in languages like Python or JavaScript outside a database environment. It relies on hash table lookups and set operations to keep the solution linear.
Approach 2: SQL Group By and Count Distinct (O(n) time, O(1) extra space)
Databases already provide optimized aggregation primitives, making SQL the cleanest solution. Group the rows by date_id and make_name, then compute the number of unique identifiers using COUNT(DISTINCT lead_id) and COUNT(DISTINCT partner_id). The database engine performs grouping and deduplication internally using optimized indexing and aggregation algorithms. This approach is standard for SQL and database interview questions because it expresses the entire solution in a single query.
Recommended for interviews: The SQL GROUP BY with COUNT(DISTINCT) is the expected solution since the problem is fundamentally a database aggregation task. Demonstrating the set-based grouping approach in a general-purpose language shows that you understand how SQL aggregation works internally, but the SQL query is what interviewers typically look for.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Group By with Set Aggregation | O(n) | O(n) | When solving outside SQL using Python or JavaScript and simulating aggregation with hash maps and sets |
| SQL GROUP BY with COUNT DISTINCT | O(n) | O(1) extra | Best choice for database queries and SQL interviews where aggregation is handled by the database engine |