Watch 2 video solutions for Orders With Maximum Quantity Above Average, a medium level problem involving Database. This walkthrough by Everyday Data Science has 3,270 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: OrdersDetails
+-------------+------+ | Column Name | Type | +-------------+------+ | order_id | int | | product_id | int | | quantity | int | +-------------+------+ (order_id, product_id) is the primary key (combination of columns with unique values) for this table. A single order is represented as multiple rows, one row for each product in the order. Each row of this table contains the quantity ordered of the product product_id in the order order_id.
You are running an e-commerce site that is looking for imbalanced orders. An imbalanced order is one whose maximum quantity is strictly greater than the average quantity of every order (including itself).
The average quantity of an order is calculated as (total quantity of all products in the order) / (number of different products in the order). The maximum quantity of an order is the highest quantity of any single product in the order.
Write a solution to find the order_id of all imbalanced orders.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: OrdersDetails table: +----------+------------+----------+ | order_id | product_id | quantity | +----------+------------+----------+ | 1 | 1 | 12 | | 1 | 2 | 10 | | 1 | 3 | 15 | | 2 | 1 | 8 | | 2 | 4 | 4 | | 2 | 5 | 6 | | 3 | 3 | 5 | | 3 | 4 | 18 | | 4 | 5 | 2 | | 4 | 6 | 8 | | 5 | 7 | 9 | | 5 | 8 | 9 | | 3 | 9 | 20 | | 2 | 9 | 4 | +----------+------------+----------+ Output: +----------+ | order_id | +----------+ | 1 | | 3 | +----------+ Explanation: The average quantity of each order is: - order_id=1: (12+10+15)/3 = 12.3333333 - order_id=2: (8+4+6+4)/4 = 5.5 - order_id=3: (5+18+20)/3 = 14.333333 - order_id=4: (2+8)/2 = 5 - order_id=5: (9+9)/2 = 9 The maximum quantity of each order is: - order_id=1: max(12, 10, 15) = 15 - order_id=2: max(8, 4, 6, 4) = 8 - order_id=3: max(5, 18, 20) = 20 - order_id=4: max(2, 8) = 8 - order_id=5: max(9, 9) = 9 Orders 1 and 3 are imbalanced because they have a maximum quantity that exceeds the average quantity of every order.
Problem Overview: You are given an Orders table containing order_id, product_id, and quantity. The goal is to return all order_id values where the maximum quantity ordered within that order is greater than the overall average quantity across the table.
Approach 1: Aggregation with Subquery (O(n) time, O(1) extra space)
This approach relies on SQL aggregation. First compute the global average quantity using a subquery: SELECT AVG(quantity) FROM Orders. Then group rows by order_id and calculate MAX(quantity) for each order. The HAVING clause filters only those groups where the maximum quantity exceeds the computed average. This works because HAVING operates after aggregation, letting you compare aggregated values directly against the subquery result.
The database performs a full scan to evaluate the aggregate functions. For each order group, it calculates the maximum quantity and checks the condition against the global average. This solution is concise and leverages native SQL aggregation primitives such as GROUP BY, MAX(), and AVG(). These operations are core patterns in database queries and frequently appear in analytical SQL tasks.
If the table is indexed by order_id, the grouping step becomes more efficient since the database can cluster rows belonging to the same order. Even without indexes, the query remains linear with respect to the number of rows. The key insight is recognizing that the problem compares two aggregated values: the per‑order maximum and the overall average.
Conceptually, the query follows three steps: compute the average quantity, aggregate rows per order to find the maximum, and filter using HAVING. This pattern is common in problems involving SQL aggregation and conditional filtering over grouped data. Understanding how GROUP BY interacts with aggregate functions is critical for solving many database interview questions.
Recommended for interviews: The aggregation + subquery solution is exactly what interviewers expect. It demonstrates clear understanding of SQL grouping, aggregate functions, and the difference between WHERE and HAVING. A brute force procedural approach is unnecessary in SQL; using native aggregates shows strong query design skills.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Aggregation with Subquery | O(n) | O(1) | General case; straightforward SQL solution using GROUP BY and HAVING |
| Derived Table + Aggregation | O(n) | O(n) | When you prefer computing averages in a temporary derived table for clarity |