Design a HashSet without using any built-in hash table libraries.
Implement MyHashSet class:
void add(key) Inserts the value key into the HashSet.bool contains(key) Returns whether the value key exists in the HashSet or not.void remove(key) Removes the value key in the HashSet. If key does not exist in the HashSet, do nothing.Example 1:
Input ["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"] [[], [1], [2], [1], [3], [2], [2], [2], [2]] Output [null, null, null, true, false, null, true, null, false] Explanation MyHashSet myHashSet = new MyHashSet(); myHashSet.add(1); // set = [1] myHashSet.add(2); // set = [1, 2] myHashSet.contains(1); // return True myHashSet.contains(3); // return False, (not found) myHashSet.add(2); // set = [1, 2] myHashSet.contains(2); // return True myHashSet.remove(2); // set = [1] myHashSet.contains(2); // return False, (already removed)
Constraints:
0 <= key <= 106104 calls will be made to add, remove, and contains.In #705 Design HashSet, you need to implement a basic HashSet without using built-in hash table libraries. The goal is to support operations like add, remove, and contains efficiently.
A common strategy is to use a hashing technique with buckets. A hash function (often key % bucketSize) maps each key to a specific bucket index. Since multiple keys may map to the same bucket, collisions must be handled using separate chaining, typically implemented with a linked list or dynamic array inside each bucket.
When inserting, compute the bucket index and store the key if it doesn’t already exist. For removal or lookup, traverse the bucket structure to find the key. Choosing a suitable bucket size helps distribute keys evenly and reduces collisions.
With a well-designed hash function and bucket distribution, most operations achieve average O(1) time complexity, while space complexity depends on the number of stored elements and bucket allocation.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Hash Buckets with Separate Chaining | Average: O(1) for add, remove, contains Worst: O(n) if many collisions occur | O(n + b) where n is number of keys and b is number of buckets |
Java Brains
The direct address table approach is simple and efficient for this problem since the maximum key size is limited to 106. We can use a boolean array of size 106+1 where each index represents a key. The value stored at a given index indicates whether the key is present in the HashSet. This allows for O(1) time complexity for the 'add', 'remove', and 'contains' operations.
Time Complexity: O(1) for all operations (add, remove, contains).
Space Complexity: O(MAX), which is O(10^6).
1class MyHashSet:
2 def __init__(self):
3 self.container = [False] * 1000001
4
5 def add(self
In Python, we make use of a list (an array-like structure) initialized with False values. The add, remove, and contains functions manipulate or query this list.
This approach simulates a hash set using a hash function to assign keys to buckets, dealing with potential collisions using chaining. We create an array of lists, where each list contains keys that hash to the same index. This way, operations require traversing these short lists upon collisions.
Time Complexity: O(1) average, O(n) worst-case for add, remove, and contains due to collision chains.
Space Complexity: O(n), where n is the number of keys actually added.
1
Watch expert explanations and walkthroughs
Practice problems asked by these companies to ace your technical interviews.
Explore More ProblemsJot down your thoughts, approach, and key learnings
Yes, HashSet or hash table design problems are commonly asked in FAANG and other top tech interviews. They test understanding of hashing, collision handling, and efficient data structure design.
A hash table structure with buckets is best for implementing a HashSet. Each bucket can internally use a linked list or array to store elements that hash to the same index. This design efficiently handles collisions and maintains fast operations.
The optimal approach is to use hashing with buckets and handle collisions using separate chaining. A hash function maps keys to bucket indices, and each bucket stores elements using a linked list or dynamic list. This allows average O(1) time for insert, delete, and lookup operations.
Collisions occur when multiple keys map to the same bucket index using the hash function. Without collision handling, data could be overwritten or lost. Techniques like separate chaining ensure multiple values can safely exist in the same bucket.
In Python, we use a class 'Bucket' holding keys as lists and another class 'MyHashSet' to manage these buckets. Each bucket corresponds to keys by modulus operation, and operations use list membership test and updates to handle keys in buckets.