Active Businesses - Solution & Explanation

Medium | Premium | Free on FleetCode | Database | 4 min read | Asked at: Yelp

Problem Statement

Table: Events

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| business_id   | int     |
| event_type    | varchar |
| occurrences   | int     | 
+---------------+---------+
(business_id, event_type) is the primary key (combination of columns with unique values) of this table.
Each row in the table logs the info that an event of some type occurred at some business for a number of times.

The average activity for a particular event_type is the average occurrences across all companies that have this event.

An active business is a business that has more than one event_type such that their occurrences is strictly greater than the average activity for that event.

Write a solution to find all active businesses.

Return the result table in any order.

The result format is in the following example.

Example 1:

Input: 
Events table:
+-------------+------------+-------------+
| business_id | event_type | occurrences |
+-------------+------------+-------------+
| 1           | reviews    | 7           |
| 3           | reviews    | 3           |
| 1           | ads        | 11          |
| 2           | ads        | 7           |
| 3           | ads        | 6           |
| 1           | page views | 3           |
| 2           | page views | 12          |
+-------------+------------+-------------+
Output: 
+-------------+
| business_id |
+-------------+
| 1           |
+-------------+
Explanation:  
The average activity for each event can be calculated as follows:
- 'reviews': (7+3)/2 = 5
- 'ads': (11+7+6)/3 = 8
- 'page views': (3+12)/2 = 7.5
The business with id=1 has 7 'reviews' events (more than 5) and 11 'ads' events (more than 8), so it is an active business.
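
To check these numbers by hand, the example can be loaded into a scratch database. The following is a minimal sketch assuming MySQL; the `VARCHAR(50)` length is an assumption, since the problem statement only says `varchar`.

```sql
-- Schema from the problem statement; VARCHAR(50) is an assumed length.
CREATE TABLE Events (
    business_id INT,
    event_type  VARCHAR(50),
    occurrences INT,
    PRIMARY KEY (business_id, event_type)
);

-- Sample rows from Example 1.
INSERT INTO Events (business_id, event_type, occurrences) VALUES
    (1, 'reviews', 7),
    (3, 'reviews', 3),
    (1, 'ads', 11),
    (2, 'ads', 7),
    (3, 'ads', 6),
    (1, 'page views', 3),
    (2, 'page views', 12);

-- Average activity per event type.
SELECT event_type, AVG(occurrences) AS avg_occurrences
FROM Events
GROUP BY event_type;
```

On the sample data the last query should report an average of 5 for 'reviews', 8 for 'ads', and 7.5 for 'page views', matching the explanation.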

Approach Overview

Problem Overview: The table stores how many times each business triggered a specific event type. A business is considered active if its occurrence count is higher than the average for that event type in at least two different event categories. The result should return the IDs of such businesses.

Approach 1: Correlated Subquery Comparison (O(n^2) time, O(1) extra space)

A straightforward way compares each row's occurrences value with the average for the same event_type using a correlated subquery. For every record, compute AVG(occurrences) from rows sharing the same event type. If the row's value is greater than the computed average, count it toward that business. Finally, group by business_id and keep businesses where the count of above-average events is at least two.

This approach is easy to reason about but inefficient. The database repeatedly recomputes averages for each row, which leads to repeated scans of the same data. On large datasets the query planner may struggle to optimize it, resulting in near O(n^2) behavior.
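
A minimal sketch of the correlated subquery approach, assuming MySQL and the Events table from the problem statement. The aliases `e` and `e2` are illustrative names, not part of the problem.

```sql
SELECT e.business_id
FROM Events AS e
WHERE e.occurrences > (
    -- Average for this row's event type, logically recomputed per outer row.
    SELECT AVG(e2.occurrences)
    FROM Events AS e2
    WHERE e2.event_type = e.event_type
)
GROUP BY e.business_id
HAVING COUNT(*) >= 2;  -- "more than one" above-average event type
```

Because (business_id, event_type) is the primary key, each surviving row is a distinct event type for its business, so COUNT(*) counts event types directly.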

Approach 2: Aggregation + Join (O(n) time, O(n) space)

A more efficient solution first calculates the average occurrences per event_type using a grouped subquery. This produces a small derived table mapping each event type to its average. Next, join this derived table back to the original Events table and filter rows where occurrences > avg_occurrences. Each remaining row represents an event where the business performed above the average.

Group these filtered rows by business_id and count how many qualifying event types each business has. Businesses with a count of at least two satisfy the problem requirement. This approach works efficiently because the average for each event type is computed once, then reused through a join.
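
To see what the join and filter produce before the final grouping, the intermediate rows can be inspected directly. This is a sketch assuming MySQL and the sample data from Example 1; the alias `a` and the column name `avg_occurrences` are illustrative.

```sql
SELECT e.business_id, e.event_type, e.occurrences, a.avg_occurrences
FROM Events AS e
JOIN (
    -- One row per event type with its average.
    SELECT event_type, AVG(occurrences) AS avg_occurrences
    FROM Events
    GROUP BY event_type
) AS a
    ON a.event_type = e.event_type
WHERE e.occurrences > a.avg_occurrences
ORDER BY e.business_id, e.event_type;

-- Rows expected for Example 1 (averages shown without trailing zeros):
--   business 1, 'ads',        11 occurrences, average 8
--   business 1, 'reviews',     7 occurrences, average 5
--   business 2, 'page views', 12 occurrences, average 7.5
```

Business 1 appears twice and business 2 once, so only business 1 passes the final count check.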

This pattern (aggregate first, then filter with a join) is common in SQL and database interview problems. It separates the computation of global statistics (averages) from row-level filtering, so each average is computed once per event type rather than once per row, and the engine can use efficient grouping and join strategies.

Recommended for interviews: The aggregation + join approach is the expected solution. It demonstrates comfort with SQL aggregation, derived tables, and filtering grouped results using GROUP BY with HAVING. Mentioning the correlated subquery shows baseline understanding, but the optimized aggregation pattern reflects stronger SQL query design.

Solution

Code

MySQL
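
A minimal sketch of the aggregation + join approach (Approach 2), written from the description above and assuming the Events table from the problem statement. The alias `avg_by_type` and the column name `avg_occurrences` are illustrative choices.

```sql
SELECT e.business_id
FROM Events AS e
JOIN (
    -- Compute the average once per event type.
    SELECT event_type, AVG(occurrences) AS avg_occurrences
    FROM Events
    GROUP BY event_type
) AS avg_by_type
    ON avg_by_type.event_type = e.event_type
-- Keep only rows strictly above their event type's average.
WHERE e.occurrences > avg_by_type.avg_occurrences
GROUP BY e.business_id
-- An active business has more than one qualifying event type.
HAVING COUNT(*) >= 2;
```

On the Example 1 data this should return a single row with business_id 1.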

Detailed Complexity Analysis

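+------------------------------------+--------+-------+------------------------------------------------------------------------------------------------+
| Approach                           | Time   | Space | When to Use                                                                                    |
+------------------------------------+--------+-------+------------------------------------------------------------------------------------------------+
| Correlated Subquery Comparison     | O(n^2) | O(1)  | Quick prototype or small datasets where repeated average calculations are acceptable           |
| Aggregation + Join (Derived Table) | O(n)   | O(n)  | Preferred approach for interviews and production queries with grouping and average comparisons |
+------------------------------------+--------+-------+------------------------------------------------------------------------------------------------+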

Video Solution

LeetCode Medium 1126 Yelp Interview SQL Question with Detailed Explanation • Everyday Data Science • 2,534 views

Frequently Asked Questions

Is Active Businesses easy or hard?

Active Businesses is considered Medium difficulty on LeetCode. The challenge is recognizing that you must compare each row to an aggregated average and then apply a second aggregation to count qualifying event types per business.

Active Businesses Python/Java solution

This problem is designed for SQL rather than general-purpose languages. The standard solution uses MySQL with a derived table that stores event_type averages, joins it with Events, and applies GROUP BY business_id with HAVING COUNT(*) >= 2.

How to solve Active Businesses in O(n)?

First compute AVG(occurrences) for each event_type using GROUP BY. Join this derived table back to Events and filter rows where occurrences > average. Finally group by business_id and keep businesses having COUNT(*) >= 2. Each step is based on linear scans and grouped operations, giving near O(n) performance.

What is the best approach for Active Businesses?

The best approach computes the average occurrences per event_type using GROUP BY, then joins this result with the original table. Filter rows where occurrences exceed the average and group by business_id. Businesses with at least two such events are active. This aggregation + join pattern runs in roughly O(n) time: one pass to compute the averages, followed by a join and a final grouping over the same rows.

Is Active Businesses asked at Google/Amazon/Meta?

This page lists Yelp as the company that asked this problem. Similar SQL aggregation and filtering problems may also come up in database interview rounds at companies such as Amazon, Meta, and Google. They test understanding of GROUP BY, HAVING, and joins with derived tables rather than complex algorithms.

What data structure is used in Active Businesses?

The problem relies on relational database operations rather than traditional data structures. Conceptually it uses grouped aggregation, which behaves like mapping each event_type to an average value before filtering and counting results per business.

What is the time complexity of Active Businesses?

The optimized SQL solution runs in about O(n) time where n is the number of rows in the Events table. The database performs a grouped aggregation to compute averages and then a join plus grouping step to count qualifying events per business.

Related Problems

Combine Two Tables

Second Highest Salary

Problem Info

Difficulty: Medium

Acceptance: 65.8%

Approaches: 1

Reading time: 4 min

Asked at: Yelp
