#295 Find Median from Data Stream - Solution

Dashboard Questions Topics Companies Sheets

The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.

For example, for arr = [2,3,4], the median is 3.
For example, for arr = [2,3], the median is (2 + 3) / 2 = 2.5.

Implement the MedianFinder class:

MedianFinder() initializes the MedianFinder object.
void addNum(int num) adds the integer num from the data stream to the data structure.
double findMedian() returns the median of all elements so far. Answers within 10^-5 of the actual answer will be accepted.

Example 1:

Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]

Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1);    // arr = [1]
medianFinder.addNum(2);    // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3);    // arr[1, 2, 3]
medianFinder.findMedian(); // return 2.0

Constraints:

-10⁵ <= num <= 10⁵
There will be at least one element in the data structure before calling findMedian.
At most 5 * 10⁴ calls will be made to addNum and findMedian.

Follow up:

If all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?
If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?

Approach	Time Complexity	Space Complexity
Sorting after each insertion	O(n log n) per operation	O(n)
Two Heaps (Max-Heap + Min-Heap)	O(log n) insertion, O(1) median retrieval	O(n)

Solutions (5)

Using Two Heaps (Min-Heap and Max-Heap)

The main idea is to use a max-heap to store the smaller half of the numbers and a min-heap to store the larger half. The max-heap allows you to efficiently fetch the largest element of the smaller half, and the min-heap allows you to efficiently fetch the smallest element of the larger half. To maintain the balance, ensure that the number of elements in the max-heap is either equal to or one more than the number of elements in the min-heap.

The time complexity for adding a number is O(log n) due to heap adjustments. Finding the median has a time complexity of O(1). Space complexity is O(n), where n is the number of elements.

Python C++Java JavaScript

1import java.util.PriorityQueue;
2
3class MedianFinder {
4    private PriorityQueue<Integer> small;
5    private PriorityQueue<Integer> large;
6
7    public MedianFinder() {
8        small = new PriorityQueue<>((a, b) -> b - a);
9        large = new PriorityQueue<>();
10    }
11
12    public void addNum(int num) {
13        small.add(num);
14        large.add(small.poll());
15        if (small.size() < large.size()) {
16            small.add(large.poll());
17        }
18    }
19
20    public double findMedian() {
21        if (small.size() > large.size()) {
22            return small.peek();
23        }
24        return (small.peek() + large.peek()) / 2.0;
25    }
26}

Explanation

This implementation uses Java's PriorityQueue for max and min heaps. After each insertion, elements are adjusted between two heaps to keep balance. The median is derived from the heap's top elements based on their sizes.

Using a Balanced Binary Search Tree

An alternative approach is to use a balanced binary search tree (BST) to keep track of the numbers. This allows maintaining a sorted order of elements, which can make finding the median efficient since a BST can give average-case time complexity of O(log n) for insertion and O(1) for median finding.

Insertion and balancing in the sorted set take O(log n) time, and findings of the median are O(1). Space complexity is O(n).

1using System;
2using System.Collections.Generic;
3
4public class MedianFinder {
5    private SortedSet<(int, int)> small;
6    private SortedSet<(int, int)> large;
7    private int counter = 0;

    public MedianFinder() {
        small = new SortedSet<(int, int)>();
        large = new SortedSet<(int, int)>();
    }

    public void AddNum(int num) {
        small.Add((num, counter++));
        large.Add(small.Max);
        small.Remove(small.Max);
        if (large.Count > small.Count) {
            small.Add(large.Min);
            large.Remove(large.Min);
        }
    }

    public double FindMedian() {
        if (small.Count > large.Count) {
            return small.Max.Item1;
        }
        return (small.Max.Item1 + large.Min.Item1) / 2.0;
    }
}