You are given an integer array arr. You can choose a set of integers and remove all the occurrences of these integers in the array.
Return the minimum size of the set so that at least half of the integers of the array are removed.
Example 1:
Input: arr = [3,3,3,3,5,5,5,2,2,7]
Output: 2
Explanation: Choosing {3,7} will make the new array [5,5,5,2,2] which has size 5 (i.e equal to half of the size of the old array).
Possible sets of size 2 are {3,5},{3,2},{5,2}.
Choosing set {2,7} is not possible as it will make the new array [3,3,3,3,5,5,5] which has a size greater than half of the size of the old array.
Example 2:
Input: arr = [7,7,7,7,7,7]
Output: 1
Explanation: The only possible set you can choose is {7}. This will make the new array empty.
Constraints:
2 <= arr.length <= 105arr.length is even.1 <= arr[i] <= 105Problem Overview: You are given an integer array and must remove all occurrences of some chosen values. The goal is to pick the smallest number of unique integers such that removing them deletes at least half of the array elements. The challenge is deciding which values to remove first to minimize the set size.
Approach 1: Sort by Frequency (Greedy) (Time: O(n log n), Space: O(n))
Start by counting how often each number appears using a hash table. Each frequency tells you how many elements disappear if that value is removed. The greedy insight is simple: always remove the most frequent values first because they shrink the array fastest. After building the frequency map, place the counts into a list and sort them in descending order using sorting. Iterate through the sorted frequencies, subtracting them from the remaining array size until at least n / 2 elements are removed. The number of frequencies used is the answer.
This approach is straightforward and works well for interviews because the logic is easy to explain. The dominant cost comes from sorting the frequency list. If there are k unique numbers, sorting costs O(k log k), which is at most O(n log n) in the worst case.
Approach 2: Bucket Sort for Frequency Count (Time: O(n), Space: O(n))
You can avoid sorting by observing that frequencies are bounded by the array size. After counting occurrences with a hash table, create a bucket array where index i stores how many numbers appear exactly i times. This technique is a variation of bucket sorting often used for frequency-based greedy problems.
Iterate through the bucket array from highest frequency to lowest. Each step represents choosing numbers that remove many elements at once. Subtract frequency × count_of_numbers from the remaining elements until at least half of the array is removed. Because each element contributes to exactly one bucket and you scan the bucket array once, the total time complexity stays O(n). This makes it faster than sorting for very large arrays.
The bucket strategy relies on the same greedy idea: prioritize values that remove the most elements. The difference is how the frequencies are processed. Instead of sorting them, the algorithm groups them by frequency and walks from largest to smallest.
Recommended for interviews: The greedy frequency approach with sorting is usually the expected answer. It clearly demonstrates the key idea—remove the most frequent numbers first—and is easy to implement in any language. The bucket-based solution shows deeper optimization skills by reducing the complexity to O(n), which can impress interviewers when you discuss performance improvements.
This approach involves sorting the elements of the array based on their frequency. We start by counting the frequency of each element. Next, we create a list of these frequencies sorted in decreasing order. By selecting the most frequent elements first, we remove them from the array until at least half of the array size is removed.
The code defines a function minSetSize that calculates the minimum set size required to remove at least half of the integers in the array. It uses an array to count the frequency of each element, sorts this frequency array, and removes elements starting with the most frequent to fulfill the condition. The function returns the size of the set needed.
Time Complexity: O(n log n) due to sorting.
Space Complexity: O(n) for storing frequencies.
This approach utilizes bucket sort, where frequencies are counted and used as indices in a bucket array to represent how many numbers occur with identical frequency. This helps in efficiently choosing elements to remove, optimizing beyond traditional sorting by directly accessing through indices.
This C code implements a bucket-based frequency count. It populates a bucket array where the index represents frequency and the value represents how many numbers have that frequency. By accessing these buckets from highest to lowest, we efficiently calculate the number of elements we need to remove from the array to reach at least half its size.
Time Complexity: O(n) because of bucket sorting.
Space Complexity: O(n) due to auxiliary arrays.
We can use a hash table or an array cnt to count the occurrences of each number in the array arr. Then, we sort the numbers in cnt in descending order. We traverse cnt from largest to smallest, adding the current number x to the answer and adding x to m. If m geq \frac{n}{2}, we return the answer.
The time complexity is O(n times log n), and the space complexity is O(n). Here, n is the length of the array arr.
Python
Java
C++
Go
TypeScript
| Approach | Complexity |
|---|---|
| Sort by Frequency | Time Complexity: O(n log n) due to sorting. |
| Bucket Sort for Frequency Count | Time Complexity: O(n) because of bucket sorting. |
| Counting + Sorting | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Sort by Frequency (Greedy) | O(n log n) | O(n) | General interview solution. Simple implementation using hash map and sorting. |
| Bucket Sort for Frequency Count | O(n) | O(n) | When optimizing beyond sorting. Useful when array size is large and frequency bounds allow bucket grouping. |
Leetcode 1338. Reduce Array Size to The Half • Fraz • 10,495 views views
Watch 9 more video solutions →Practice Reduce Array Size to The Half with our built-in code editor and test cases.
Practice on FleetCode