Delete Duplicate Emails - Video Solutions

EasyDatabase

LeetCode 196: Delete Duplicate Emails [SQL]

5:1222,252 views

10 video solutions available

Delete Duplicate Emails - Video Solution

Watch 10 video solutions for Delete Duplicate Emails, a easy level problem involving Database. This walkthrough by Frederik Müller has 22,252 views views. Want to try solving it yourself? Practice on FleetCode or read the detailed text solution.

Problem Statement

Table: Person

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| email       | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table.
Each row of this table contains an email. The emails will not contain uppercase letters.

Write a solution to delete all duplicate emails, keeping only one unique email with the smallest id.

For SQL users, please note that you are supposed to write a DELETE statement and not a SELECT one.

For Pandas users, please note that you are supposed to modify Person in place.

After running your script, the answer shown is the Person table. The driver will first compile and run your piece of code and then show the Person table. The final order of the Person table does not matter.

The result format is in the following example.

Example 1:

Input: 
Person table:
+----+------------------+
| id | email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
| 3  | john@example.com |
+----+------------------+
Output: 
+----+------------------+
| id | email            |
+----+------------------+
| 1  | john@example.com |
| 2  | bob@example.com  |
+----+------------------+
Explanation: john@example.com is repeated two times. We keep the row with the smallest Id = 1.

Read full problem with examples

Approach Overview

Problem Overview: The Delete Duplicate Emails problem gives you a Person table containing id and email. Multiple rows may share the same email. The task is to remove duplicate records while keeping only the row with the smallest id for each email address.

Approach 1: SQL Using EXISTS Clause (O(n^2) time, O(1) space)

This method deletes rows if another row with the same email and a smaller id already exists. The query checks duplicates using an EXISTS subquery. For every row, the database scans for a matching email with a lower identifier. If found, the current row is removed. The approach is concise and works well in relational databases that optimize correlated subqueries. This technique relies heavily on concepts from SQL and database query optimization.

Approach 2: Using SQL with Self-Join (O(n^2) time, O(1) space)

A self-join compares rows in the same table. The query joins Person p1 with Person p2 where the email values match but p1.id > p2.id. That condition identifies duplicates that should be removed because a smaller id already exists. The delete operation targets the larger id rows. Self-joins are widely used for record comparison tasks and are often easier to reason about than nested subqueries.

Approach 3: Using a Hash Table (Dictionary) (O(n) time, O(n) space)

If the data is processed in application code instead of directly in SQL, a hash table can track emails already seen. Iterate through each row, and store the first occurrence of every email in a dictionary keyed by email address. If another row with the same email appears, mark it for deletion. Hash lookups run in constant time, so the entire process completes in linear time. This technique demonstrates the power of hash tables for duplicate detection.

Approach 4: SQL-Based Duplicate Removal with Aggregation (O(n log n) time, O(n) space)

Another SQL pattern uses grouping or window functions to identify the smallest id per email. For example, you compute the minimum id for each email group and delete rows whose id is larger than that minimum. This approach is especially useful when the database engine optimizes grouping operations efficiently. It is common in production SQL cleanup tasks where deduplication happens in batches.

Recommended for interviews: The SQL EXISTS solution is the most common answer because it is compact and expresses the logic directly: delete rows where a smaller id with the same email already exists. Mentioning the self-join variant shows deeper SQL understanding. In application-layer problems, the hash table solution demonstrates algorithmic thinking and achieves O(n) time.

Complexity Analysis

Approach	Time	Space	When to Use
SQL Using EXISTS Clause	O(n^2)	O(1)	Standard SQL solution for deleting duplicates while keeping smallest id
SQL Self-Join	O(n^2)	O(1)	Useful when comparing rows within the same table
Hash Table (Dictionary)	O(n)	O(n)	When processing data outside the database in application code
SQL Aggregation / Grouping	O(n log n)	O(n)	When using GROUP BY or window functions to identify unique rows

Continue Learning

Practice this problem

Solve it on FleetCode

Read Solution & Explanation

Detailed text walkthrough

Delete Duplicate Emails - Video Solutions

LeetCode 196: Delete Duplicate Emails [SQL]

Delete Duplicate Emails | Leetcode 196 | Crack SQL Interviews in 50 Qs #mysql #leetcode

Leetcode 196 - Delete Duplicate Emails | SQL CTE Interview Question

Leetcode 196 - Delete Duplicate Emails - Python Solution | FAANG Interviews

196. Delete Duplicate Emails leetcode Solution

LeetCode 196 - Delete Duplicate Emails (Python and SQL) [Easy]

SQL 196 Delete Duplicate Emails

196. Delete Duplicate Emails | LeetCode | SQL | Solution | Interview Problem

196 Delete Duplicate Emails | 80/100 Days of SQL Challenge | Leetcode

LeetCode 196 | Delete Duplicate Emails | Delete | SQL

Delete Duplicate Emails - Video Solution

Problem Statement

Approach Overview

Complexity Analysis

Continue Learning

Delete Duplicate Emails - Video Solutions

LeetCode 196: Delete Duplicate Emails [SQL]

Delete Duplicate Emails | Leetcode 196 | Crack SQL Interviews in 50 Qs #mysql #leetcode

Leetcode 196 - Delete Duplicate Emails | SQL CTE Interview Question

Leetcode 196 - Delete Duplicate Emails - Python Solution | FAANG Interviews

196. Delete Duplicate Emails leetcode Solution

LeetCode 196 - Delete Duplicate Emails (Python and SQL) [Easy]

SQL 196 Delete Duplicate Emails

196. Delete Duplicate Emails | LeetCode | SQL | Solution | Interview Problem

196 Delete Duplicate Emails | 80/100 Days of SQL Challenge | Leetcode

LeetCode 196 | Delete Duplicate Emails | Delete | SQL

Delete Duplicate Emails - Video Solution

Problem Statement

Approach Overview

Complexity Analysis

Continue Learning