You are given an array arr of size n consisting of non-empty strings.
Find a string array answer of size n such that:
answer[i] is the shortest substring of arr[i] that does not occur as a substring in any other string in arr. If multiple such substrings exist, answer[i] should be the lexicographically smallest. And if no such substring exists, answer[i] should be an empty string.Return the array answer.
Example 1:
Input: arr = ["cab","ad","bad","c"] Output: ["ab","","ba",""] Explanation: We have the following: - For the string "cab", the shortest substring that does not occur in any other string is either "ca" or "ab", we choose the lexicographically smaller substring, which is "ab". - For the string "ad", there is no substring that does not occur in any other string. - For the string "bad", the shortest substring that does not occur in any other string is "ba". - For the string "c", there is no substring that does not occur in any other string.
Example 2:
Input: arr = ["abc","bcd","abcd"] Output: ["","","abcd"] Explanation: We have the following: - For the string "abc", there is no substring that does not occur in any other string. - For the string "bcd", there is no substring that does not occur in any other string. - For the string "abcd", the shortest substring that does not occur in any other string is "abcd".
Constraints:
n == arr.length2 <= n <= 1001 <= arr[i].length <= 20arr[i] consists only of lowercase English letters.Problem Overview: Given an array of strings, return the shortest substring of each string that does not appear in any other string in the array. If multiple substrings satisfy the condition, choose the lexicographically smallest one. If none exists, return an empty string for that index.
Approach 1: Brute Force with Substring Check (O(n * m^3) time, O(1) extra space)
Generate every possible substring for each string and check whether that substring appears in any other string in the array. For a string of length m, there are O(m^2) substrings. For each candidate substring, iterate through the remaining n-1 strings and use a substring search operation like find(). This leads to roughly O(n * m^3) complexity in the worst case. While slow, it clearly demonstrates the definition of an "uncommon substring" and is useful for understanding the problem before optimizing. This approach mainly relies on simple string operations and iteration across the array.
Approach 2: Optimized Set / Hash Map Approach (O(n * m^2) time, O(n * m^2) space)
Instead of repeatedly scanning other strings, precompute how often each substring appears across the entire array. For every string, generate all substrings and store them in a set to avoid duplicates within the same string. Insert those substrings into a global hash table that tracks how many different strings contain that substring. After building this frequency map, iterate again through each string's substrings ordered by increasing length and lexicographical order. The first substring with frequency 1 is the answer for that string. This reduces repeated scanning and replaces it with constant‑time hash lookups.
This optimization works because substring uniqueness is determined globally. Once substring frequencies are known, checking whether a substring is unique becomes a simple hash lookup instead of scanning every other string. Substring generation remains O(m^2), which dominates the runtime.
A more advanced variation can store substrings inside a trie or suffix structure, but for typical interview constraints the hash‑set approach is simpler and performs well.
Recommended for interviews: Start by explaining the brute force idea since it directly models the problem definition. Then move to the optimized hash‑map approach that precomputes substring frequencies. Interviewers typically expect the second solution because it reduces repeated work and demonstrates practical use of substring enumeration with hash lookups.
This approach involves generating all possible substrings for each string in arr and checking if each substring appears in any other string in arr. We keep track of the shortest substring for each string that does not appear in any of the other strings. To ensure the result is the lexicographically smallest substring when two are of equal length, sort the substrings and select the first valid one.
This Python function finds the shortest unique substring for each in the array. It first generates all possible substrings for a given string and checks if they appear in any other string, ensuring it also accounts for lexicographical order by sorting the substrings. The function get_substrings generates all possible substrings, and the function is_unique determines if a substring is unique across other strings in the array.
Time Complexity: O(n * m^3) where n is the number of strings and m is the maximal length of the strings. We generate all substrings of a string, which is O(m^2), and then for each substring check it against other strings which is another O(n*m) check.
Space Complexity: O(n * m) due to storage of substrings, assuming the worst case.
Instead of iterating through all substrings, we can use sets to store already seen substrings for each string. We compare for uniqueness by checking each substring against combined sets of other strings. This can be implemented using hash maps to store substrings to avoid overlaps efficiently. We also keep track of the first occurrence of a substring length-wise to ensure the answer is lexicographically smallest when possible.
This C++ implementation utilizes sets to store substrings of all other strings compared to the current string. By calculating all substrings in a non-target string, we avoid redundancy and save on re-calculating seen substrings, providing optimized performance. This satisfies uniqueness check requirements effectively.
Time Complexity: Approx O(n^2 * m^2 log m) due to hash operations (insertions/checks are approximately O(1) but dependent on number of substrings stored).
Space Complexity: O(n * m^2) for storing substrings in sets, albeit more efficient than full array checks.
Given the small data scale, we can directly enumerate all substrings of each string and then determine whether it is a substring of other strings.
Specifically, we first enumerate each string arr[i], then enumerate the length j of each substring from small to large, and then enumerate the starting position l of each substring. We can get the current substring as sub = arr[i][l:l+j]. Then we determine whether sub is a substring of other strings. If it is, we skip the current substring; otherwise, we update the answer.
The time complexity is O(n^2 times m^4), and the space complexity is O(m). Where n is the length of the string array arr, and m is the maximum length of the string. In this problem, m \le 20.
Python
Java
C++
Go
TypeScript
| Approach | Complexity |
|---|---|
| Brute Force Approach with Substring Check | Time Complexity: O(n * m^3) where n is the number of strings and m is the maximal length of the strings. We generate all substrings of a string, which is O(m^2), and then for each substring check it against other strings which is another O(n*m) check. |
| Optimized Set Approach | Time Complexity: Approx O(n^2 * m^2 log m) due to hash operations (insertions/checks are approximately O(1) but dependent on number of substrings stored). |
| Enumeration | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Brute Force Substring Check | O(n * m^3) | O(1) | Useful for understanding the definition of uncommon substrings or when constraints are extremely small |
| Optimized Hash Set / Frequency Map | O(n * m^2) | O(n * m^2) | General case solution. Avoids repeated scanning of other strings using hash lookups |
Shortest Uncommon Substring in an Array | Detailed Intuition | Dry Run | Leetcode 3076 | Contest 388 • codestorywithMIK • 4,255 views views
Watch 9 more video solutions →Practice Shortest Uncommon Substring in an Array with our built-in code editor and test cases.
Practice on FleetCode