Market Analysis III is a medium-level Database problem. A video walkthrough of the solution is available from Everyday Data Science.
Table: Users
+----------------+---------+
| Column Name    | Type    |
+----------------+---------+
| seller_id      | int     |
| join_date      | date    |
| favorite_brand | varchar |
+----------------+---------+
seller_id is the column of unique values for this table.
This table contains seller id, join date, and favorite brand of sellers.
Table: Items
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| item_id       | int     |
| item_brand    | varchar |
+---------------+---------+
item_id is the column of unique values for this table.
This table contains item id and item brand.
Table: Orders
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| order_id      | int     |
| order_date    | date    |
| item_id       | int     |
| seller_id     | int     |
+---------------+---------+
order_id is the column of unique values for this table.
item_id is a foreign key to the Items table.
seller_id is a foreign key to the Users table.
This table contains order id, order date, item id and seller id.
Write a solution to find the top seller who has sold the highest number of unique items with a different brand than their favorite brand. If there are multiple sellers with the same highest count, return all of them.
Return the result table ordered by seller_id in ascending order.
The result format is in the following example.
Example 1:
Input:
Users table:
+-----------+------------+----------------+
| seller_id | join_date  | favorite_brand |
+-----------+------------+----------------+
| 1         | 2019-01-01 | Lenovo         |
| 2         | 2019-02-09 | Samsung        |
| 3         | 2019-01-19 | LG             |
+-----------+------------+----------------+
Orders table:
+----------+------------+---------+-----------+
| order_id | order_date | item_id | seller_id |
+----------+------------+---------+-----------+
| 1        | 2019-08-01 | 4       | 2         |
| 2        | 2019-08-02 | 2       | 3         |
| 3        | 2019-08-03 | 3       | 3         |
| 4        | 2019-08-04 | 1       | 2         |
| 5        | 2019-08-04 | 4       | 2         |
+----------+------------+---------+-----------+
Items table:
+---------+------------+
| item_id | item_brand |
+---------+------------+
| 1       | Samsung    |
| 2       | Lenovo     |
| 3       | LG         |
| 4       | HP         |
+---------+------------+
Output:
+-----------+-----------+
| seller_id | num_items |
+-----------+-----------+
| 2         | 1         |
| 3         | 1         |
+-----------+-----------+
Explanation:
- The user with seller_id 2 has sold three items, but only two of them are not marked as a favorite. We will include a unique count of 1 because both of these items are identical.
- The user with seller_id 3 has sold two items, but only one of them is not marked as a favorite. We will include just that non-favorite item in our count.
Since seller_ids 2 and 3 have the same count of one item each, they both will be displayed in the output.
Problem Overview: The task asks you to analyze marketplace data and find the seller, or sellers in the case of a tie, who sold the highest number of unique items whose brand differs from their favorite brand. The data is spread across three tables (Users, Orders, and Items), so the solution relies on joining the tables, counting distinct non-favorite items per seller, and keeping every seller whose count equals the maximum.
Approach 1: Equijoin + Grouping + Subquery (O(n log n) time, O(n) space)
This approach joins the marketplace tables using equijoins to combine seller information, order history, and item brand data. Join Orders to Users on seller_id and to Items on item_id, then keep only the rows where item_brand differs from the seller's favorite_brand. Group the remaining rows by seller_id and count the distinct item_id values, so an item that the same seller sold several times is counted once.
Next, use a subquery to find the highest count across all sellers and keep every seller whose count equals it. The outer query performs the final projection, returning seller_id and num_items ordered by seller_id in ascending order. Sellers with no orders, or whose sales all match their favorite brand, produce no rows in the grouped result and are excluded naturally. The heavy lifting is done by the grouped query that computes the count per seller.
The key insight is separating the problem into two phases: first compute each seller's count of distinct non-favorite items, then compare every count with the maximum. Defining the grouped result once and reusing it for both the maximum and the final filter keeps the query short and avoids repeating the join logic. The joins use the key columns seller_id and item_id, which are unique in Users and Items respectively, so each order row matches at most one seller and one item and the joins do not multiply rows.
Conceptually, this solution uses relational operations such as SQL Join, grouping with aggregation via GROUP BY, and filtering with Subqueries. These patterns appear frequently in database interview questions where you must isolate a specific row within grouped records.
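For the problem as stated (sellers with the highest number of unique items sold outside their favorite brand), the joins, grouping, and subquery can be sketched as below. This is a minimal sketch rather than a reference solution: the SQL is wrapped in Python's standard sqlite3 module only so it can be run against the Example 1 data, and the names `counts`, `top_sellers`, and `build_example_db` are illustrative assumptions that do not come from the problem.

```python
import sqlite3

# Candidate query. The CTE name `counts` and the aliases o/u/i are
# illustrative choices, not part of the problem statement.
QUERY = """
WITH counts AS (
    SELECT o.seller_id,
           COUNT(DISTINCT o.item_id) AS num_items   -- a repeated item counts once
    FROM Orders AS o
    JOIN Users AS u ON u.seller_id = o.seller_id    -- equijoin: seller details
    JOIN Items AS i ON i.item_id = o.item_id        -- equijoin: item brand
    WHERE i.item_brand <> u.favorite_brand          -- non-favorite brands only
    GROUP BY o.seller_id
)
SELECT seller_id, num_items
FROM counts
WHERE num_items = (SELECT MAX(num_items) FROM counts)  -- keep every tie
ORDER BY seller_id;
"""


def build_example_db():
    """Create an in-memory database holding the Example 1 rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Users (seller_id INTEGER PRIMARY KEY,
                            join_date TEXT, favorite_brand TEXT);
        CREATE TABLE Items (item_id INTEGER PRIMARY KEY, item_brand TEXT);
        CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_date TEXT,
                             item_id INTEGER, seller_id INTEGER);
    """)
    conn.executemany("INSERT INTO Users VALUES (?, ?, ?)", [
        (1, "2019-01-01", "Lenovo"),
        (2, "2019-02-09", "Samsung"),
        (3, "2019-01-19", "LG"),
    ])
    conn.executemany("INSERT INTO Items VALUES (?, ?)", [
        (1, "Samsung"), (2, "Lenovo"), (3, "LG"), (4, "HP"),
    ])
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", [
        (1, "2019-08-01", 4, 2),
        (2, "2019-08-02", 2, 3),
        (3, "2019-08-03", 3, 3),
        (4, "2019-08-04", 1, 2),
        (5, "2019-08-04", 4, 2),
    ])
    return conn


def top_sellers(conn):
    """Run the query and return (seller_id, num_items) tuples."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    print(top_sellers(build_example_db()))  # [(2, 1), (3, 1)]
```

On the Example 1 data, seller 2 sold item 4 (HP) twice and seller 3 sold item 2 (Lenovo) once outside their favorite brands, so `COUNT(DISTINCT o.item_id)` gives 1 for each and both rows survive the `MAX` filter. Using `COUNT(*)` instead would give seller 2 a count of 2 and drop seller 3, which does not match the expected output.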
Recommended for interviews: Interviewers expect a clean SQL solution using joins, a grouped distinct count per seller, and a subquery that finds the maximum count so that ties are returned. A brute-force mindset (scanning all orders and manually filtering) shows the basic idea, but the grouped subquery demonstrates real SQL fluency and the ability to express relational logic concisely.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Equijoin + Grouping + Subquery | O(n log n) | O(n) | General SQL solution when multiple tables must be combined and every group tied for the top aggregate value must be returned |
| Join with Ordered Filtering | O(n log n) | O(n) | When the per-seller counts are sorted and only the sellers tied for the highest count are kept |