Design HashSet - Solution & Explanation

Difficulty: Easy
Topics: Array, Hash Table, Linked List, Design
Reading time: 20 min
Asked at: Amazon, Meta, Google +3

Problem Statement

Design a HashSet without using any built-in hash table libraries.

Implement MyHashSet class:

void add(key) Inserts the value key into the HashSet.
bool contains(key) Returns whether the value key exists in the HashSet or not.
void remove(key) Removes the value key in the HashSet. If key does not exist in the HashSet, do nothing.

Example 1:

Input
["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"]
[[], [1], [2], [1], [3], [2], [2], [2], [2]]
Output
[null, null, null, true, false, null, true, null, false]

Explanation
MyHashSet myHashSet = new MyHashSet();
myHashSet.add(1);      // set = [1]
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(1); // return True
myHashSet.contains(3); // return False, (not found)
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(2); // return True
myHashSet.remove(2);   // set = [1]
myHashSet.contains(2); // return False, (already removed)

Constraints:

0 <= key <= 10⁶
At most 10⁴ calls will be made to add, remove, and contains.

Approach Overview

Problem Overview: Build a HashSet from scratch without using built‑in hash table libraries. The structure must support three operations: add(key), remove(key), and contains(key). Each operation should run efficiently while handling potential key collisions.

Approach 1: Direct Address Table (Time: O(1), Space: O(N))

The simplest design maps every possible key directly to an index in an array. Because the constraints limit keys to 0 ≤ key ≤ 10⁶, you can allocate a boolean array of 10⁶ + 1 slots in which the index is the key itself. add(key) sets the slot to true, remove(key) sets it to false, and contains(key) returns the value at that index, so every operation is a single constant-time array access. Collisions cannot occur because each key has its own slot. The tradeoff is memory: space is O(N), where N here is the size of the key range rather than the number of stored keys, so the full array is allocated even if only a few keys are ever added. The approach works well when the key range is small and known in advance, but it is plain array indexing rather than a true hash structure.

Approach 2: Chaining Using Arrays for Collision Resolution (Average Time: O(1), Worst: O(N), Space: O(N))

A more realistic hash set uses a hash function and buckets. Create an array of buckets, then compute the bucket index with a simple hash = key % bucketSize. Each bucket stores multiple keys using a small list or linked list. When you call add, compute the hash index and append the key if it does not already exist in that bucket. remove searches the bucket and deletes the key, while contains scans the bucket to check membership. Average time remains O(1) if the hash function distributes keys evenly. In the worst case, when many keys collide into the same bucket, operations degrade to O(N). This design demonstrates how real hash tables handle collisions and is closer to production implementations.

Recommended for interviews: Interviewers typically expect the hashing approach with buckets because it shows you understand collision handling and the role of a hash function. The direct address table is still useful to mention first since it proves you recognize the constant-time lookup property. Moving from the direct mapping idea to a bucketed hash structure demonstrates stronger system design and data structure fundamentals.

Approach 1: Direct Address Table

The direct address table approach is simple and efficient for this problem because the maximum key value is limited to 10⁶. We can use a boolean array of size 10⁶+1 where each index represents a key. The value stored at a given index indicates whether that key is present in the HashSet. This gives O(1) time complexity for the add, remove, and contains operations.

The solution stores a boolean array of size 1000001 (inside a struct in the C version). Each index in this array corresponds to a potential key, and its boolean value records whether that key is currently in the set. The methods for adding, removing, and checking keys simply write or read one slot of this array.
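The description above can be sketched in Python as follows. This is a minimal illustration rather than the page's original code; the attribute name `present` and the constant `MAX_KEY` are names chosen here, with the bound taken from the problem constraints.

```python
class MyHashSet:
    """Direct address table: one boolean slot per possible key."""

    MAX_KEY = 10**6  # assumed name; the bound comes from the constraints

    def __init__(self) -> None:
        # Index i holds True exactly when key i is in the set.
        self.present = [False] * (self.MAX_KEY + 1)

    def add(self, key: int) -> None:
        self.present[key] = True

    def remove(self, key: int) -> None:
        # Clearing an already-clear slot is harmless, so no existence check.
        self.present[key] = False

    def contains(self, key: int) -> bool:
        return self.present[key]
```

Because every key owns its slot, none of the methods needs a loop or a comparison against other keys.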

Complexity

Time Complexity: O(1) for all operations (add, remove, contains).
Space Complexity: O(MAX), where MAX = 10^6 + 1 is the size of the key range.

Approach 2: Chaining using Arrays for Collision Resolution

This approach builds a hash set from a hash function that assigns keys to buckets and resolves collisions by chaining. We create an array of lists, where each list holds the keys that hash to the same index. An operation computes the bucket index and then scans only that bucket's list, which stays short when keys are spread evenly.

In C, we use an array of linked lists to manage collisions. The table has a fixed number of buckets (BASE), and the hash function determines which bucket a key belongs to. add appends a new key to the list in its bucket if it is not already there, remove deletes a node by updating links, and contains searches the linked list for the key.
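A minimal Python sketch of the same idea, using a dynamic array (a Python list) per bucket instead of C linked lists. The value `BASE = 769` and the helper name `_bucket` are assumptions made for illustration; the text only says that the bucket count is fixed.

```python
class MyHashSet:
    """Hash set with separate chaining; each bucket is a Python list."""

    BASE = 769  # number of buckets; a prime picked here as an assumption

    def __init__(self) -> None:
        self.buckets = [[] for _ in range(self.BASE)]

    def _bucket(self, key: int) -> list:
        # Hash function: map the key to a bucket index with modulo.
        return self.buckets[key % self.BASE]

    def add(self, key: int) -> None:
        bucket = self._bucket(key)
        if key not in bucket:  # linear scan of this one bucket only
            bucket.append(key)

    def remove(self, key: int) -> None:
        bucket = self._bucket(key)
        if key in bucket:
            bucket.remove(key)

    def contains(self, key: int) -> bool:
        return key in self._bucket(key)
```

Keys 1 and 770 both map to bucket 1 under this bucket count, so they share a list, yet each can still be added, found, and removed independently.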

Complexity

Time Complexity: O(1) average, O(n) worst-case for add, remove, and contains due to collision chains.
Space Complexity: O(n), where n is the number of keys actually added.

Approach 3: Static Array Implementation

Directly create an array of size 1000001, initially with each element set to false, indicating that the element does not exist in the hash set.

When adding an element to the hash set, set the corresponding position in the array to true; when deleting an element, set the corresponding position in the array to false; when checking if an element exists, directly return the value at the corresponding position in the array.

The time complexity of the above operations is O(1).
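One possible Python rendering of the static array, followed by a small driver that replays Example 1. Using a `bytearray` (one byte per key) in place of a list of booleans is a choice made for this sketch, and the `replay` helper is a hypothetical name, not part of the problem's interface.

```python
class MyHashSet:
    def __init__(self) -> None:
        # One byte per possible key (0..1_000_000); 0 = absent, 1 = present.
        self.data = bytearray(1_000_001)

    def add(self, key: int) -> None:
        self.data[key] = 1

    def remove(self, key: int) -> None:
        self.data[key] = 0

    def contains(self, key: int) -> bool:
        return self.data[key] == 1


def replay(ops, args):
    """Run parallel lists of operation names and arguments, collecting results."""
    results = []
    obj = None
    for op, arg in zip(ops, args):
        if op == "MyHashSet":
            obj = MyHashSet()  # the constructor call reports null/None
            results.append(None)
        else:
            results.append(getattr(obj, op)(*arg))
    return results


ops = ["MyHashSet", "add", "add", "contains", "contains",
       "add", "contains", "remove", "contains"]
args = [[], [1], [2], [1], [3], [2], [2], [2], [2]]
print(replay(ops, args))
# [None, None, None, True, False, None, True, None, False]
```

The printed list corresponds to the Output row of Example 1, with Python's None standing in for null.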

Approach 4: Array of Linked Lists

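We can also create an array of size SIZE=1000, where each position in the array is a linked list. A key is stored in the list at index key % SIZE, so each operation traverses only the list for that one position.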
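A sketch of this layout in Python with explicit singly linked lists. The `Node` class and the attribute name `heads` are assumptions introduced for the example; only SIZE=1000 comes from the text.

```python
class Node:
    """Singly linked list node holding one key."""

    def __init__(self, key, next=None):
        self.key = key
        self.next = next


class MyHashSet:
    SIZE = 1000  # number of array positions, as stated above

    def __init__(self) -> None:
        # Each slot is the head of a linked list (None means an empty list).
        self.heads = [None] * self.SIZE

    def contains(self, key: int) -> bool:
        node = self.heads[key % self.SIZE]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def add(self, key: int) -> None:
        if not self.contains(key):
            i = key % self.SIZE
            # Insert at the head once the key is known to be absent.
            self.heads[i] = Node(key, self.heads[i])

    def remove(self, key: int) -> None:
        i = key % self.SIZE
        prev, node = None, self.heads[i]
        while node is not None:
            if node.key == key:
                if prev is None:
                    self.heads[i] = node.next  # unlink the head node
                else:
                    prev.next = node.next      # bypass the matching node
                return
            prev, node = node, node.next
```

Inserting at the head avoids walking to the tail, and remove tracks the previous node so it can relink around the deleted one.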

Complexity Comparison

Approach	Time	Space
Approach 1: Direct Address Table	O(1) for all operations (add, remove, contains)	O(MAX), which is O(10^6)
Approach 2: Chaining using Arrays for Collision Resolution	O(1) average, O(n) worst-case for add, remove, and contains due to collision chains	O(n), where n is the number of keys actually added
Approach 3: Static Array Implementation	O(1) for all operations	O(10^6) for the array of size 1000001
Approach 4: Array of Linked Lists	Not stated	Not stated

Detailed Complexity Analysis

Approach	Time	Space	When to Use
Direct Address Table	O(1)	O(N), N = size of the key range	When the key range is small and known in advance; simplest constant-time implementation
Hashing with Chaining (Bucket Array)	Average O(1), Worst O(N)	O(N), N = number of stored keys	General hash set design when keys may collide and memory must scale with stored elements
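The gap between the average and worst case for chaining depends on how keys spread across buckets. The short sketch below counts bucket occupancy under a modulo hash for two key patterns; the helper name `bucket_sizes` and the bucket count of 1000 are assumptions made for illustration.

```python
def bucket_sizes(keys, num_buckets):
    """Count how many keys land in each bucket under key % num_buckets."""
    sizes = [0] * num_buckets
    for key in keys:
        sizes[key % num_buckets] += 1
    return sizes


NUM_BUCKETS = 1000  # assumed bucket count for this demonstration

# Consecutive keys spread evenly: every bucket receives exactly one key.
even = bucket_sizes(range(1000), NUM_BUCKETS)
print(max(even))    # 1

# Multiples of the bucket count all collide in bucket 0.
skewed = bucket_sizes(range(0, 1_000_000, 1000), NUM_BUCKETS)
print(max(skewed))  # 1000
```

In the second case every operation has to scan one list containing all stored keys, which is the O(N) worst case listed in the table.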

Video Solution

Design HashSet - Leetcode 705 - Python • NeetCodeIO • 48,508 views

Frequently Asked Questions

Is Design HashSet easy or hard?

Design HashSet is classified as an Easy problem because the required operations are straightforward. The challenge mainly checks whether you understand hashing basics and how to implement set behavior without relying on built‑in libraries.

Design HashSet Python/Java solution

Python and Java solutions typically implement a bucket array with a simple modulo hash function. Each bucket stores keys in a list or linked structure. Methods add, remove, and contains compute the hash index and then operate only within that bucket, giving O(1) average complexity.

How to solve Design HashSet in O(n)?

The target is O(1) average time per operation, not O(N). A hash table with buckets gives constant-time lookups and updates on average, provided the hash function spreads keys evenly. O(N) describes the space: the structure stores up to N elements, while each individual operation remains constant on average.

What is the best approach for Design HashSet?

The hashing approach with bucket arrays and chaining is the most practical solution. Keys are mapped to buckets using a hash function such as key % bucketSize, and collisions are handled using a list inside each bucket. Average time for add, remove, and contains becomes O(1) with O(N) total space.

Is Design HashSet asked at Google/Amazon/Meta?

Hash table design questions like Design HashSet frequently appear in interviews at companies such as Amazon, Meta, and Google because they test understanding of hashing, collision handling, and data structure design. Candidates are often asked to explain tradeoffs between direct addressing and bucket-based hashing.

What data structure is used in Design HashSet?

The core data structure is a hash table. It typically consists of an array of buckets combined with a hash function to map keys to indices. Buckets may store elements using linked lists or dynamic arrays to resolve collisions.

What is the time complexity of Design HashSet?

Most implementations achieve average O(1) time for add, remove, and contains operations using hashing. When chaining is used for collision resolution, worst‑case time becomes O(N) if many keys fall into the same bucket. Space complexity is O(N) for storing the keys.

Related Problems

Design HashMap
Design Skiplist

Problem Info

Difficulty: Easy
Acceptance: 67.9%
Approaches: 4
Reading time: 20 min
Asked at: Amazon, Meta, Google, Bloomberg, Marqeta
