Watch 4 video solutions for Total Sales Amount by Year, a hard level problem involving Database. This walkthrough by Code with Carter has 1,495 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Product
+---------------+---------+ | Column Name | Type | +---------------+---------+ | product_id | int | | product_name | varchar | +---------------+---------+ product_id is the primary key (column with unique values) for this table. product_name is the name of the product.
Table: Sales
+---------------------+---------+ | Column Name | Type | +---------------------+---------+ | product_id | int | | period_start | date | | period_end | date | | average_daily_sales | int | +---------------------+---------+ product_id is the primary key (column with unique values) for this table. period_start and period_end indicate the start and end date for the sales period, and both dates are inclusive. The average_daily_sales column holds the average daily sales amount of the items for the period. The dates of the sales years are between 2018 to 2020.
Write a solution to report the total sales amount of each item for each year, with corresponding product_name, product_id, report_year, and total_amount.
Return the result table ordered by product_id and report_year.
The result format is in the following example.
Example 1:
Input: Product table: +------------+--------------+ | product_id | product_name | +------------+--------------+ | 1 | LC Phone | | 2 | LC T-Shirt | | 3 | LC Keychain | +------------+--------------+ Sales table: +------------+--------------+-------------+---------------------+ | product_id | period_start | period_end | average_daily_sales | +------------+--------------+-------------+---------------------+ | 1 | 2019-01-25 | 2019-02-28 | 100 | | 2 | 2018-12-01 | 2020-01-01 | 10 | | 3 | 2019-12-01 | 2020-01-31 | 1 | +------------+--------------+-------------+---------------------+ Output: +------------+--------------+-------------+--------------+ | product_id | product_name | report_year | total_amount | +------------+--------------+-------------+--------------+ | 1 | LC Phone | 2019 | 3500 | | 2 | LC T-Shirt | 2018 | 310 | | 2 | LC T-Shirt | 2019 | 3650 | | 2 | LC T-Shirt | 2020 | 10 | | 3 | LC Keychain | 2019 | 31 | | 3 | LC Keychain | 2020 | 31 | +------------+--------------+-------------+--------------+ Explanation: LC Phone was sold for the period of 2019-01-25 to 2019-02-28, and there are 35 days for this period. Total amount 35*100 = 3500. LC T-shirt was sold for the period of 2018-12-01 to 2020-01-01, and there are 31, 365, 1 days for years 2018, 2019 and 2020 respectively. LC Keychain was sold for the period of 2019-12-01 to 2020-01-31, and there are 31, 31 days for years 2019 and 2020 respectively.
Problem Overview: You are given product information and sales periods with an average_daily_sales. Each sales record spans a date range, but the report must show total sales per product per year. The challenge is splitting a single date range across multiple years and calculating the exact number of overlapping days for each year.
Approach 1: Year Expansion + Date Overlap Calculation (O(n * y) time, O(y) space)
The practical SQL solution expands each sales record across the years it overlaps, then calculates how many days of that record fall inside each year. You create a small derived table containing all report years (for example 2018–2020). For every sale record, join it with those years where the sale period intersects the year. The overlap days are computed using LEAST() and GREATEST() to clamp the range inside the year boundaries. Multiply the overlapping day count by average_daily_sales and aggregate using SUM(). This pattern is common in SQL problems involving time intervals and partial overlaps.
The key insight is converting a continuous date range into discrete yearly segments. Instead of iterating day by day, you compute the intersection between two ranges: [period_start, period_end] and [year_start, year_end]. That intersection length gives the exact contribution of the sales record to that year. A final GROUP BY product_id, report_year produces the required totals.
Approach 2: Recursive CTE Year Generation (O(n * y) time, O(y) space)
If the year range is not fixed, a recursive CTE can generate all years between the minimum period_start and maximum period_end. Each recursive step increments the year until the upper bound is reached. After generating the year table, the same date-overlap logic applies. This avoids hardcoding report years and adapts automatically to new data ranges.
This method is slightly heavier but more flexible. Many database interview problems expect candidates to handle dynamic ranges rather than relying on static values. The rest of the query still uses joins and aggregation to compute yearly totals.
Recommended for interviews: The year-expansion join with overlap calculation is the most common and readable solution. It demonstrates strong understanding of SQL date arithmetic, range intersection, and aggregation. The recursive CTE variant shows deeper SQL knowledge and is useful when the report years cannot be predefined.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Year Expansion + Date Overlap | O(n * y) | O(y) | Best for fixed reporting years and most interview scenarios |
| Recursive CTE Year Generation | O(n * y) | O(y) | Useful when the year range must be derived dynamically from the dataset |