Find Median Given Frequency of Numbers - Solution & Explanation

HardPremiumFree on FleetCodeDatabase3 min readAsked at: Pinterest

Problem Statement

Table: Numbers

+-------------+------+
| Column Name | Type |
+-------------+------+
| num         | int  |
| frequency   | int  |
+-------------+------+
num is the primary key (column with unique values) for this table.
Each row of this table shows the frequency of a number in the database.

The median is the value separating the higher half from the lower half of a data sample.

Write a solution to report the median of all the numbers in the database after decompressing the Numbers table. Round the median to one decimal point.

The result format is in the following example.

Example 1:

Input: 
Numbers table:
+-----+-----------+
| num | frequency |
+-----+-----------+
| 0   | 7         |
| 1   | 1         |
| 2   | 3         |
| 3   | 1         |
+-----+-----------+
Output: 
+--------+
| median |
+--------+
| 0.0    |
+--------+
Explanation: 
If we decompress the Numbers table, we will get [0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3], so the median is (0 + 0) / 2 = 0.

Approach Overview

Problem Overview: The table stores numbers along with how many times each value appears. Instead of expanding all values, compute the median directly from the frequency distribution.

Approach 1: Expand Frequencies into Rows (Brute Force) (Time: O(n + F), Space: O(F))

The most straightforward idea is to reconstruct the full dataset. For each row (num, frequency), generate frequency copies of num, then sort the expanded list and compute the median normally. In SQL this can be simulated using recursive queries or helper number tables. After expansion, the median is the middle element if the total count is odd, or the average of the two middle elements if the count is even.

This method mirrors how you would solve the problem in memory using arrays, but it becomes expensive when frequencies are large. The dataset size becomes the sum of all frequencies, which can explode quickly. It’s useful conceptually but rarely practical in a production database environment.

Approach 2: Cumulative Frequency with Window Functions (Optimal SQL) (Time: O(n log n), Space: O(n))

The key insight: the median depends on position, not the fully expanded data. First compute the total count of elements using SUM(frequency). The median positions are (total + 1) / 2 and (total + 2) / 2. For odd counts these two positions are identical; for even counts they represent the two middle elements whose average forms the median.

Next, sort rows by num and compute a running cumulative frequency using SQL window functions. This cumulative value tells you the index range that each number occupies in the conceptual expanded array. For example, if cumulative frequency jumps from 5 to 9, that number covers positions 6–9.

Once you know those ranges, simply select rows where the cumulative frequency interval contains either median position. Those numbers are the median candidates. Averaging them produces the correct result for both even and odd totals. The technique behaves like a prefix sum over frequencies and avoids materializing the full dataset.

Recommended for interviews: Interviewers expect the cumulative frequency idea. The brute-force expansion shows basic understanding of medians, but the window-function approach demonstrates database reasoning and the ability to transform frequency distributions into positional ranges efficiently.

Solution

Code

MySQL

Try this approach in the editor →

Detailed Complexity Analysis

Approach	Time	Space	When to Use
Expand Frequencies into Rows	O(n + F)	O(F)	When frequencies are very small and clarity is more important than efficiency
Cumulative Frequency with Window Functions	O(n log n)	O(n)	Best SQL approach for large datasets without expanding rows
Prefix Sum Range Detection	O(n log n)	O(n)	General technique when computing statistics from frequency tables

Video Solution

Leetcode HARD 571 - Decompressing Tables with Recursive CTE - Find Median | Everyday Data Science • Everyday Data Science • 919 views views

Watch 6 more video solutions →

Frequently Asked Questions

Is Find Median Given Frequency of Numbers easy or hard?

This problem is considered hard because it requires reasoning about medians without explicitly building the dataset. The challenge is translating frequency counts into positional ranges using prefix sums or SQL window functions.

Find Median Given Frequency of Numbers Python/Java solution

In Python or Java, the typical approach computes the total frequency, determines the two median indices, and iterates through numbers while maintaining a cumulative count. When the cumulative count crosses the median positions, those numbers determine the median value.

How to solve Find Median Given Frequency of Numbers in O(n)?

If the numbers are already sorted, you can compute cumulative frequencies in a single pass and detect when the running total crosses the median positions. That reduces the scan to O(n). In SQL, sorting is usually required first, resulting in O(n log n) complexity.

What is the best approach for Find Median Given Frequency of Numbers?

The best approach uses cumulative frequencies with SQL window functions. Sort numbers, compute a running SUM(frequency), and identify where the median positions fall within the cumulative range. This avoids expanding rows and runs in O(n log n) time with O(n) space.

Is Find Median Given Frequency of Numbers asked at Google/Amazon/Meta?

Median and percentile calculations from frequency distributions appear in data engineering and analytics interviews at companies like Google, Amazon, and Meta. Variants test SQL window functions, prefix sums, and statistical queries on aggregated datasets.

What data structure is used in Find Median Given Frequency of Numbers?

The solution relies on a frequency table combined with prefix sums implemented through SQL window functions. Conceptually it maps frequencies to index ranges in a sorted array without actually storing the expanded list.

What is the time complexity of Find Median Given Frequency of Numbers?

The optimal SQL solution runs in O(n log n) time due to sorting by the number column before computing cumulative frequencies. Space complexity is O(n) for the intermediate window function results. A brute-force expansion approach can degrade to O(F) where F is the total frequency count.

Ready to solve this problem?

Practice Find Median Given Frequency of Numbers with our built-in code editor and test cases.

Practice on FleetCode

Median Employee Salary

Problem Info

DifficultyHard

Acceptance42.4%

Approaches1

Reading time3 min

Asked at

Practice this problem

Open in Editor

Find Median Given Frequency of Numbers - Solution & Explanation

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Find Median Given Frequency of Numbers - Solution & Explanation

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Solution

Code

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents