Watch 10 video solutions for Reduce Array Size to The Half, a medium level problem involving Array, Hash Table, Greedy. This walkthrough by Fraz has 10,495 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.
You are given an integer array arr. You can choose a set of integers and remove all the occurrences of these integers in the array.
Return the minimum size of the set so that at least half of the integers of the array are removed.
Example 1:
Input: arr = [3,3,3,3,5,5,5,2,2,7]
Output: 2
Explanation: Choosing {3,7} will make the new array [5,5,5,2,2] which has size 5 (i.e equal to half of the size of the old array).
Possible sets of size 2 are {3,5},{3,2},{5,2}.
Choosing set {2,7} is not possible as it will make the new array [3,3,3,3,5,5,5] which has a size greater than half of the size of the old array.
Example 2:
Input: arr = [7,7,7,7,7,7]
Output: 1
Explanation: The only possible set you can choose is {7}. This will make the new array empty.
Constraints:
2 <= arr.length <= 105arr.length is even.1 <= arr[i] <= 105Problem Overview: You are given an integer array and must remove all occurrences of some chosen values. The goal is to pick the smallest number of unique integers such that removing them deletes at least half of the array elements. The challenge is deciding which values to remove first to minimize the set size.
Approach 1: Sort by Frequency (Greedy) (Time: O(n log n), Space: O(n))
Start by counting how often each number appears using a hash table. Each frequency tells you how many elements disappear if that value is removed. The greedy insight is simple: always remove the most frequent values first because they shrink the array fastest. After building the frequency map, place the counts into a list and sort them in descending order using sorting. Iterate through the sorted frequencies, subtracting them from the remaining array size until at least n / 2 elements are removed. The number of frequencies used is the answer.
This approach is straightforward and works well for interviews because the logic is easy to explain. The dominant cost comes from sorting the frequency list. If there are k unique numbers, sorting costs O(k log k), which is at most O(n log n) in the worst case.
Approach 2: Bucket Sort for Frequency Count (Time: O(n), Space: O(n))
You can avoid sorting by observing that frequencies are bounded by the array size. After counting occurrences with a hash table, create a bucket array where index i stores how many numbers appear exactly i times. This technique is a variation of bucket sorting often used for frequency-based greedy problems.
Iterate through the bucket array from highest frequency to lowest. Each step represents choosing numbers that remove many elements at once. Subtract frequency × count_of_numbers from the remaining elements until at least half of the array is removed. Because each element contributes to exactly one bucket and you scan the bucket array once, the total time complexity stays O(n). This makes it faster than sorting for very large arrays.
The bucket strategy relies on the same greedy idea: prioritize values that remove the most elements. The difference is how the frequencies are processed. Instead of sorting them, the algorithm groups them by frequency and walks from largest to smallest.
Recommended for interviews: The greedy frequency approach with sorting is usually the expected answer. It clearly demonstrates the key idea—remove the most frequent numbers first—and is easy to implement in any language. The bucket-based solution shows deeper optimization skills by reducing the complexity to O(n), which can impress interviewers when you discuss performance improvements.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Sort by Frequency (Greedy) | O(n log n) | O(n) | General interview solution. Simple implementation using hash map and sorting. |
| Bucket Sort for Frequency Count | O(n) | O(n) | When optimizing beyond sorting. Useful when array size is large and frequency bounds allow bucket grouping. |