Table: Product
+--------------+---------+ | Column Name | Type | +--------------+---------+ | product_id | int | | product_name | varchar | | unit_price | int | +--------------+---------+ product_id is the primary key (column with unique values) of this table. Each row of this table indicates the name and the price of each product.
Table: Sales
+-------------+---------+ | Column Name | Type | +-------------+---------+ | seller_id | int | | product_id | int | | buyer_id | int | | sale_date | date | | quantity | int | | price | int | +-------------+---------+ This table can have duplicate rows. product_id is a foreign key (reference column) to the Product table. Each row of this table contains some information about one sale.
Write a solution to report the products that were only sold in the first quarter of 2019. That is, between 2019-01-01 and 2019-03-31 inclusive.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Product table: +------------+--------------+------------+ | product_id | product_name | unit_price | +------------+--------------+------------+ | 1 | S8 | 1000 | | 2 | G4 | 800 | | 3 | iPhone | 1400 | +------------+--------------+------------+ Sales table: +-----------+------------+----------+------------+----------+-------+ | seller_id | product_id | buyer_id | sale_date | quantity | price | +-----------+------------+----------+------------+----------+-------+ | 1 | 1 | 1 | 2019-01-21 | 2 | 2000 | | 1 | 2 | 2 | 2019-02-17 | 1 | 800 | | 2 | 2 | 3 | 2019-06-02 | 1 | 800 | | 3 | 3 | 4 | 2019-05-13 | 2 | 2800 | +-----------+------------+----------+------------+----------+-------+ Output: +-------------+--------------+ | product_id | product_name | +-------------+--------------+ | 1 | S8 | +-------------+--------------+ Explanation: The product with id 1 was only sold in the spring of 2019. The product with id 2 was sold in the spring of 2019 but was also sold after the spring of 2019. The product with id 3 was sold after spring 2019. We return only product 1 as it is the product that was only sold in the spring of 2019.
To solve Sales Analysis III, the goal is to identify products that were sold only during the first quarter of 2019 (January to March) and not at any other time. The main idea is to analyze the sale dates associated with each product and ensure all of them fall within the required range.
A common strategy is to group sales by product_id and check the boundaries of their sale dates. By computing the MIN(sale_date) and MAX(sale_date) for each product, you can verify whether all transactions occurred between 2019-01-01 and 2019-03-31. If both conditions are satisfied, the product qualifies.
Another approach is using a NOT EXISTS or filtering condition to exclude products that have any sales outside the Q1 2019 range. After identifying valid product IDs, join them with the Product table to retrieve the product name. Since the query mainly scans and groups sales records, the overall time complexity is proportional to the number of rows in the sales table.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Group By with MIN/MAX date filtering | O(n) | O(1) |
| NOT EXISTS / exclusion of out-of-range sales | O(n) | O(1) |
Learn With Chirag
This approach involves using SQL to join the Product and Sales tables on the product_id column. We filter the sales to include only those made in the first quarter of 2019 and use GROUP BY to ensure the product was sold only during this period. We left join again to ensure there are no sales for these products outside this period.
Time Complexity: O(n + m), where n is the number of rows in the Product table and m is the number in the Sales table. Space Complexity: O(n + m) for storing filter results.
1import pandas as pd
2
3def sales_analysis(product: pd.DataFrame, sales: pd.DataFrame) -> pd.DataFrame:
4 first_quarter_sales = sales[(sales['sale_date'] >= '2019-01-01') & (sales['sale_date'] <= '2019-03-31')]
5 other_sales = sales[(sales['sale_date'] < '2019-01-01') | (sales['sale_date'] > '2019-03-31')]
6 first_quarter_product_ids = set(first_quarter_sales['product_id'])
7 other_product_ids = set(other_sales['product_id'])
8 only_first_quarter_product_ids = first_quarter_product_ids - other_product_ids
9 result = product[product['product_id'].isin(only_first_quarter_product_ids)][['product_id', 'product_name']]
10 return resultThis Python solution uses Panda's DataFrame to filter sales within the first quarter of 2019 and excludes any products sold outside this period. It then finds products only sold in the specified period by subtracting sets of product IDs.
This approach involves using a subquery to find product IDs that have sales exclusively in the first quarter of 2019. We first identify sales in this period and ensure there are no other sales for these products outside this period using NOT EXISTS with filtered sales.
Time Complexity: O(n*m), Space Complexity: O(1) when using database indexing.
1SELECT p.product_id, p.product_name
2FROM Product p
3WHERE EXISTS (
4Watch expert explanations and walkthroughs
Practice problems asked by these companies to ace your technical interviews.
Explore More ProblemsJot down your thoughts, approach, and key learnings
Yes, similar SQL problems are common in coding interviews, especially for data or backend roles. They test understanding of joins, grouping, date filtering, and writing clean SQL queries to enforce conditions across grouped records.
The optimal approach is to group sales by product_id and verify that the earliest and latest sale dates fall within Q1 2019. Using SQL functions like MIN(sale_date) and MAX(sale_date) ensures that no sales exist outside the required range. This approach keeps the query simple and efficient.
Yes, it can also be solved using a NOT EXISTS or subquery approach. In this method, you select products whose sales do not include any records outside the Q1 2019 date range, effectively filtering out invalid products.
The problem mainly relies on SQL aggregation and filtering. Using GROUP BY along with aggregate functions like MIN and MAX helps determine whether all sales for a product fall within a specific date range.
This SQL solution uses a subquery to verify that the product was sold in Q1 2019 and ensures no sales exist for the product outside this period.