Watch the video solution for Customer Purchasing Behavior Analysis, a medium level problem involving Database. This walkthrough by Everyday Data Science has 794 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
Table: Transactions
+------------------+---------+ | Column Name | Type | +------------------+---------+ | transaction_id | int | | customer_id | int | | product_id | int | | transaction_date | date | | amount | decimal | +------------------+---------+ transaction_id is the unique identifier for this table. Each row of this table contains information about a transaction, including the customer ID, product ID, date, and amount spent.
Table: Products
+-------------+---------+ | Column Name | Type | +-------------+---------+ | product_id | int | | category | varchar | | price | decimal | +-------------+---------+ product_id is the unique identifier for this table. Each row of this table contains information about a product, including its category and price.
Write a solution to analyze customer purchasing behavior. For each customer, calculate:
Round total_amount, avg_transaction_amount, and loyalty_score to 2 decimal places.
Return the result table ordered by loyalty_score in descending order, then by customer_id in ascending order.
The query result format is in the following example.
Example:
Input:
Transactions table:
+----------------+-------------+------------+------------------+--------+ | transaction_id | customer_id | product_id | transaction_date | amount | +----------------+-------------+------------+------------------+--------+ | 1 | 101 | 1 | 2023-01-01 | 100.00 | | 2 | 101 | 2 | 2023-01-15 | 150.00 | | 3 | 102 | 1 | 2023-01-01 | 100.00 | | 4 | 102 | 3 | 2023-01-22 | 200.00 | | 5 | 101 | 3 | 2023-02-10 | 200.00 | +----------------+-------------+------------+------------------+--------+
Products table:
+------------+----------+--------+ | product_id | category | price | +------------+----------+--------+ | 1 | A | 100.00 | | 2 | B | 150.00 | | 3 | C | 200.00 | +------------+----------+--------+
Output:
+-------------+--------------+-------------------+-------------------+------------------------+--------------+---------------+ | customer_id | total_amount | transaction_count | unique_categories | avg_transaction_amount | top_category | loyalty_score | +-------------+--------------+-------------------+-------------------+------------------------+--------------+---------------+ | 101 | 450.00 | 3 | 3 | 150.00 | C | 34.50 | | 102 | 300.00 | 2 | 2 | 150.00 | C | 23.00 | +-------------+--------------+-------------------+-------------------+------------------------+--------------+---------------+
Explanation:
Note: The output is ordered by loyalty_score in descending order, then by customer_id in ascending order.
Problem Overview: You need to analyze purchasing activity across customers and determine behavioral patterns from transactional data. The task usually involves aggregating purchases per customer, comparing them within a group, and returning rows that satisfy ranking or behavioral conditions.
Approach 1: Basic Aggregation with GROUP BY (O(n) time, O(1) extra space)
The first step most engineers try is aggregating purchases using GROUP BY. You compute metrics like total orders, total spend, or purchase counts per customer. This works when the problem only asks for simple summaries. However, it breaks down when you must compare rows within the same customer group or find patterns such as "top purchase", "most recent transaction", or ranked activity. Basic aggregation collapses rows and loses row-level detail, so additional logic becomes difficult.
Approach 2: Grouping + Window Functions + Join (O(n log n) time, O(n) space)
The production-ready solution combines aggregation, window functions, and joins. First, compute per-customer metrics using GROUP BY. Then apply a window function like ROW_NUMBER(), RANK(), or DENSE_RANK() with PARTITION BY customer_id. This keeps row-level data while also allowing comparisons inside each customer partition. Window operations internally require sorting within each partition, which typically leads to O(n log n) time complexity.
After computing rankings or behavioral indicators, join the derived result back to the original table or filtered aggregates. The join step retrieves only the rows that match the required behavior (for example, the highest purchase, most recent order, or qualifying transaction pattern). This structure is flexible and readable, which is why it appears frequently in real-world analytics queries.
This approach relies heavily on SQL analytics features such as PARTITION BY and ordered windows. If you want to deepen these concepts, review SQL Window Functions, GROUP BY aggregation, and SQL Joins. Understanding how these three pieces interact is key to solving most database interview questions.
Recommended for interviews: The grouping + window function approach is the expected solution. Starting with a basic aggregation demonstrates you understand the dataset and metrics. Moving to window functions shows stronger SQL skills because you preserve row-level detail while computing ranked analytics. Interviewers typically look for correct partitioning logic, proper ordering inside the window function, and a clean join or filter that returns the final result.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Basic Aggregation with GROUP BY | O(n) | O(1) | When only summary metrics per customer are required without row-level comparison |
| Grouping + Window Functions + Join | O(n log n) | O(n) | When you must rank, compare, or filter transactions within each customer partition |