Every valid email consists of a local name and a domain name, separated by the '@' sign. Besides lowercase letters, the email may contain one or more '.' or '+'.
"alice@leetcode.com", "alice" is the local name, and "leetcode.com" is the domain name.If you add periods '.' between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name. Note that this rule does not apply to domain names.
"alice.z@leetcode.com" and "alicez@leetcode.com" forward to the same email address.If you add a plus '+' in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered. Note that this rule does not apply to domain names.
"m.y+name@email.com" will be forwarded to "my@email.com".It is possible to use both of these rules at the same time.
Given an array of strings emails where we send one email to each emails[i], return the number of different addresses that actually receive mails.
Example 1:
Input: emails = ["test.email+alex@leetcode.com","test.e.mail+bob.cathy@leetcode.com","testemail+david@lee.tcode.com"] Output: 2 Explanation: "testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails.
Example 2:
Input: emails = ["a@leetcode.com","b@leetcode.com","c@leetcode.com"] Output: 3
Constraints:
1 <= emails.length <= 1001 <= emails[i].length <= 100emails[i] consist of lowercase English letters, '+', '.' and '@'.emails[i] contains exactly one '@' character.'+' character.".com" suffix.".com" suffix.In #929 Unique Email Addresses, the goal is to determine how many unique email addresses actually receive emails after applying specific normalization rules. Each email consists of a local name and a domain name separated by @. The key observation is that characters after a + in the local name are ignored, and all . characters in the local name should be removed before processing.
A common approach is to iterate through the list of emails and normalize each one by applying these rules. After reconstructing the cleaned local name with its original domain, store the resulting email in a hash set. Since sets only keep unique values, this naturally eliminates duplicates.
This method is efficient because each email is processed once and normalization involves simple string operations. Using a hash-based structure allows quick insertion and duplicate checking.
The overall complexity depends on the number of emails and their average length, making this approach both practical and optimal for interview settings.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Email Normalization with Hash Set | O(n * m) | O(n) |
NeetCode
In this approach, we'll use a set to store unique email addresses. We'll iterate over each email, split it into local and domain parts, then manipulate the local part by removing any characters after the first '+' and all periods '.' before rejoining with the domain. Finally, we'll add the standardized email to the set. The size of the set gives the count of unique emails.
Time Complexity: O(N * M) where N is the number of emails and M is the maximum length of an email address.
Space Complexity: O(N) due to the storage of unique emails in a set.
1def num_unique_emails(emails):
2 unique_emails = set()
3 for email in emails:
4 local, domain = email.split('@')
5In Python, we use a set to hold unique email addresses. We split each email by '@' to get local and domain parts. The local part is then split at the '+' to ignore everything beyond it and replaced '.' with ''. The processed local part is then combined with the domain part before adding to the set. The number of unique emails is the size of the set.
In this approach, instead of transforming emails directly using a set, we'll utilize arrays to manipulate the email parts. We'll maintain a boolean array for discovered unique emails and an index array to track up to where the email is unique. This efficient tracking helps us determine unique addresses without full storage, but more focused on transformations directly performed within the iteration loop.
Time Complexity: O(N * M), since per email processed once fully.
Space Complexity: O(N), enabling efficient tracking of unique addresses.
1Watch expert explanations and walkthroughs
Practice problems asked by these companies to ace your technical interviews.
Explore More ProblemsJot down your thoughts, approach, and key learnings
Yes, this problem or similar string normalization questions can appear in FAANG-style interviews. It tests string manipulation, understanding of hashing, and the ability to apply rules efficiently to large inputs.
A hash set is the best data structure because it efficiently stores unique normalized email strings. It allows constant-time average insertion and lookup, making it ideal for eliminating duplicates during processing.
The optimal approach is to normalize each email according to the given rules and store the results in a hash set. This ensures duplicates are removed automatically while maintaining efficient lookups. Each processed email contributes at most one entry to the set.
The problem defines rules where dots in the local name are ignored and characters after a plus sign are disregarded. These transformations simulate how some email providers process addresses, so applying them ensures different representations map to the same effective inbox.
This is similar to Approach 1, but a slight variation in explanation to describe transformations as stepwise and using string-based mutation directly. The Python simplicity helps accomplish similar efficiency with identical methods.