Watch the video solution for Fix Product Name Format, an easy-level problem involving Database. This walkthrough by Everyday Data Science has 1,974 views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Sales

| Column Name | Type |
|---|---|
| sale_id | int |
| product_name | varchar |
| sale_date | date |

sale_id is the column with unique values for this table. Each row of this table contains the product name and the date it was sold.
Since the Sales table was filled in manually in the year 2000, product_name may contain leading and/or trailing white spaces, and product names are case-insensitive.
Write a solution to report:

- product_name in lowercase without leading or trailing white spaces.
- sale_date in the format ('YYYY-MM').
- total, the number of times the product was sold in this month.

Return the result table ordered by product_name in ascending order. In case of a tie, order it by sale_date in ascending order.
The result format is in the following example.
Example 1:
Input:

Sales table:

| sale_id | product_name | sale_date |
|---|---|---|
| 1 | LCPHONE | 2000-01-16 |
| 2 | LCPhone | 2000-01-17 |
| 3 | LcPhOnE | 2000-02-18 |
| 4 | LCKeyCHAiN | 2000-02-19 |
| 5 | LCKeyChain | 2000-02-28 |
| 6 | Matryoshka | 2000-03-31 |

Output:

| product_name | sale_date | total |
|---|---|---|
| lckeychain | 2000-02 | 2 |
| lcphone | 2000-01 | 2 |
| lcphone | 2000-02 | 1 |
| matryoshka | 2000-03 | 1 |

Explanation: In January, 2 LcPhones were sold. Please note that the product names are not case sensitive and may contain spaces. In February, 2 LCKeychains and 1 LCPhone were sold. In March, one matryoshka was sold.
Problem Overview: The Fix Product Name Format problem asks you to normalize inconsistent product names and aggregate sales by month. Product names may contain uppercase letters or extra spaces, and the sale date must be formatted as YYYY-MM. After cleaning the data, you group by the normalized product name and month, then count how many sales occurred.
Approach 1: Direct Aggregation Without Normalization (O(n) time, O(1) space)
The most naive approach simply groups by the raw product_name and the month formatted from sale_date. In MySQL you can use DATE_FORMAT(sale_date, '%Y-%m') and aggregate with COUNT(*). This runs in roughly O(n) time because the database scans all rows once during grouping. The problem is data quality: under a case-sensitive comparison, names like 'iPhone', 'iphone', or ' iphone ' are treated as different groups, and the reported names keep their original casing and spaces. This produces incorrect aggregates when the same product appears in multiple formats.
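For illustration only, the naive grouping might look like the sketch below. It assumes MySQL (DATE_FORMAT is MySQL-specific) and the Sales table from the problem statement. It is shown to make the flaw concrete, not as a solution.

```sql
-- Naive version: groups on the raw product_name, so differently
-- formatted spellings of the same product are not cleaned up.
SELECT
    product_name,                                   -- raw, uncleaned name
    DATE_FORMAT(sale_date, '%Y-%m') AS sale_date,   -- e.g. '2000-01'
    COUNT(*) AS total
FROM Sales
GROUP BY product_name, DATE_FORMAT(sale_date, '%Y-%m')
ORDER BY product_name, sale_date;
```

How the variants are grouped here depends on the collation of product_name. With a case-sensitive collation, every distinct spelling becomes its own group. In either case the output keeps the original casing and whitespace, which does not match the required format.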
Approach 2: Normalize Name + Monthly Aggregation (O(n) time, O(1) space)
The correct solution standardizes the product name before aggregation. Use TRIM() to remove leading and trailing spaces and LOWER() to convert the string to lowercase. Then format the date using DATE_FORMAT(sale_date, '%Y-%m'). After normalization, group by these cleaned values, compute the count with COUNT(*), and order the result by the cleaned name and then by month, as the problem requires. The database still performs a single scan with grouping, so the time complexity remains O(n) and the extra space is constant outside the aggregation process.
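The steps above can be sketched as a single query. This is a minimal version assuming MySQL and the Sales table from the problem statement. The GROUP BY repeats the cleaning expressions instead of using the aliases, because MySQL resolves names in GROUP BY against table columns before select aliases, and the alias product_name would otherwise refer to the raw column.

```sql
SELECT
    LOWER(TRIM(product_name)) AS product_name,      -- strip spaces, then lowercase
    DATE_FORMAT(sale_date, '%Y-%m') AS sale_date,   -- e.g. '2000-01'
    COUNT(*) AS total                               -- sales per product per month
FROM Sales
-- Group on the cleaned expressions, not on the raw columns.
GROUP BY LOWER(TRIM(product_name)), DATE_FORMAT(sale_date, '%Y-%m')
-- ORDER BY resolves to the select aliases, i.e. the cleaned values.
ORDER BY product_name, sale_date;
```

On the Example 1 input this should return the four rows of the expected output: lckeychain for 2000-02 with total 2, lcphone for 2000-01 with total 2, lcphone for 2000-02 with total 1, and matryoshka for 2000-03 with total 1. Because the month is formatted as a zero-padded 'YYYY-MM' string, sorting it as text also sorts it chronologically.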
This pattern appears frequently in SQL and database interview questions. Real production datasets often contain inconsistent casing or whitespace. Normalizing strings before aggregation ensures logically identical values collapse into the same group. SQL string functions like LOWER, TRIM, and date formatting utilities are the core tools.
Recommended for interviews: Interviewers expect the normalization approach. Showing the naive aggregation first demonstrates you understand grouping logic, but the correct solution proves you think about messy real-world data. Combining string cleanup with aggregation is a common pattern in SQL problem solving.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Direct Aggregation | O(n) | O(1) | Quick grouping when data is already clean and consistently formatted |
| Normalize Name + Aggregate (LOWER, TRIM, DATE_FORMAT) | O(n) | O(1) | Best choice when product names contain inconsistent casing or extra whitespace |