Design a HashSet without using any built-in hash table libraries.
Implement MyHashSet class:
void add(key) Inserts the value key into the HashSet.bool contains(key) Returns whether the value key exists in the HashSet or not.void remove(key) Removes the value key in the HashSet. If key does not exist in the HashSet, do nothing.
Example 1:
Input ["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"] [[], [1], [2], [1], [3], [2], [2], [2], [2]] Output [null, null, null, true, false, null, true, null, false] Explanation MyHashSet myHashSet = new MyHashSet(); myHashSet.add(1); // set = [1] myHashSet.add(2); // set = [1, 2] myHashSet.contains(1); // return True myHashSet.contains(3); // return False, (not found) myHashSet.add(2); // set = [1, 2] myHashSet.contains(2); // return True myHashSet.remove(2); // set = [1] myHashSet.contains(2); // return False, (already removed)
Constraints:
0 <= key <= 106104 calls will be made to add, remove, and contains.Problem Overview: Build a HashSet from scratch without using built‑in hash table libraries. The structure must support three operations: add(key), remove(key), and contains(key). Each operation should run efficiently while handling potential key collisions.
Approach 1: Direct Address Table (Time: O(1), Space: O(N))
The simplest design maps every possible key directly to an index in an array. Since the problem constraints limit keys to a known range (typically 0 ≤ key ≤ 10^6), you allocate a boolean array where the index represents the key itself. add(key) sets the slot to true, remove(key) sets it to false, and contains(key) checks the value at that index. Every operation becomes a constant-time array lookup or update. This approach avoids collisions entirely because each key has a unique slot. The tradeoff is memory usage: space complexity is O(N) for the full key range even if only a few keys are stored. It works well when the key range is small and predictable and relies purely on an array rather than a true hash structure.
Approach 2: Chaining Using Arrays for Collision Resolution (Average Time: O(1), Worst: O(N), Space: O(N))
A more realistic hash set uses a hash function and buckets. Create an array of buckets, then compute the bucket index with a simple hash = key % bucketSize. Each bucket stores multiple keys using a small list or linked list. When you call add, compute the hash index and append the key if it does not already exist in that bucket. remove searches the bucket and deletes the key, while contains scans the bucket to check membership. Average time remains O(1) if the hash function distributes keys evenly. In the worst case, when many keys collide into the same bucket, operations degrade to O(N). This design demonstrates how real hash tables handle collisions and is closer to production implementations.
Recommended for interviews: Interviewers typically expect the hashing approach with buckets because it shows you understand collision handling and the role of a hash function. The direct address table is still useful to mention first since it proves you recognize the constant-time lookup property. Moving from the direct mapping idea to a bucketed hash structure demonstrates stronger system design and data structure fundamentals.
The direct address table approach is simple and efficient for this problem since the maximum key size is limited to 106. We can use a boolean array of size 106+1 where each index represents a key. The value stored at a given index indicates whether the key is present in the HashSet. This allows for O(1) time complexity for the 'add', 'remove', and 'contains' operations.
The solution creates a struct with a boolean array of size 1000001. Each index in this array corresponds to a potential key, and its boolean value represents whether that key is currently present in the set. Methods for adding, removing, and checking for keys simply modify or access this boolean array.
Time Complexity: O(1) for all operations (add, remove, contains).
Space Complexity: O(MAX), which is O(10^6).
This approach simulates a hash set using a hash function to assign keys to buckets, dealing with potential collisions using chaining. We create an array of lists, where each list contains keys that hash to the same index. This way, operations require traversing these short lists upon collisions.
In C, we use an array of linked lists to manage collisions. A fixed-size hash-table ('BASE') is initialized, and the hash function determines which index (or 'bucket') a key belongs to. We handle collisions by adding new keys to the list at each bucket. The remove function deletes a node by updating links, and contains searches the linked list to find a key.
Time Complexity: O(1) average, O(n) worst-case for add, remove, and contains due to collision chains.
Space Complexity: O(n), where n is the number of keys actually added.
Directly create an array of size 1000001, initially with each element set to false, indicating that the element does not exist in the hash set.
When adding an element to the hash set, set the corresponding position in the array to true; when deleting an element, set the corresponding position in the array to false; when checking if an element exists, directly return the value at the corresponding position in the array.
The time complexity of the above operations is O(1).
Python
Java
C++
Go
TypeScript
We can also create an array of size SIZE=1000, where each position in the array is a linked list.
| Approach | Complexity |
|---|---|
| Approach 1: Direct Address Table | Time Complexity: O(1) for all operations (add, remove, contains). |
| Approach 2: Chaining using Arrays for Collision Resolution | Time Complexity: O(1) average, O(n) worst-case for add, remove, and contains due to collision chains. |
| Static Array Implementation | — |
| Array of Linked Lists | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Direct Address Table | O(1) | O(N) | When the key range is small and known in advance; simplest constant-time implementation |
| Hashing with Chaining (Bucket Array) | Average O(1), Worst O(N) | O(N) | General hash set design when keys may collide and memory must scale with stored elements |
Design HashSet - Leetcode 705 - Python • NeetCodeIO • 48,508 views views
Watch 9 more video solutions →Practice Design HashSet with our built-in code editor and test cases.
Practice on FleetCode