Number of Distinct Substrings in a String is a medium-level problem involving String, Trie, and Rolling Hash.
Given a string s, return the number of distinct substrings of s.
A substring of a string is obtained by deleting any number of characters (possibly zero) from the front of the string and any number (possibly zero) from the back of the string.
Example 1:
Input: s = "aabbaba"
Output: 21
Explanation: The set of distinct strings is ["a","b","aa","bb","ab","ba","aab","abb","bab","bba","aba","aabb","abba","bbab","baba","aabba","abbab","bbaba","aabbab","abbaba","aabbaba"]
Example 2:
Input: s = "abcdefg"
Output: 28
Constraints:
1 <= s.length <= 500
s consists of lowercase English letters.
Follow up: Can you solve this problem in O(n) time complexity?
Problem Overview: Given a string s, count how many different substrings appear in it. Substrings with the same characters in the same order count only once, even if they appear multiple times at different positions.
Approach 1: Brute Force Enumeration (O(n^3) time, O(n^2) space)
The most direct strategy is to generate every possible substring and track which ones are unique. Use two nested loops to choose the start index i and the exclusive end index j of each substring, then extract s[i:j] and insert it into a hash set. Because creating and hashing a substring can itself take O(n) time, the total complexity becomes O(n^3) in languages where slicing copies characters. The set removes duplicates automatically, so its final size is the answer. A string of length n has n(n+1)/2 substrings, so the O(n^2) space figure counts set entries; the total number of characters stored can reach O(n^3). This approach is easy to implement and useful for understanding the problem, but it becomes slow for large strings. It relies only on basic string operations and hash sets.
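The enumeration described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference solution, and the function name `count_distinct_brute` is a hypothetical choice.

```python
def count_distinct_brute(s: str) -> int:
    """Count distinct substrings by storing every slice in a set."""
    seen = set()
    n = len(s)
    for i in range(n):                 # start index (inclusive)
        for j in range(i + 1, n + 1):  # end index (exclusive)
            seen.add(s[i:j])           # slicing copies up to n characters
    return len(seen)


print(count_distinct_brute("aabbaba"))  # 21, as in Example 1
print(count_distinct_brute("abcdefg"))  # 28, as in Example 2
```

With s.length capped at 500, this loop visits about 125,000 substrings, which is usually fast enough in practice despite the cubic bound.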
Approach 2: Rolling Hash (String Hashing) (O(n^2) time, O(n^2) space)
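A faster approach avoids repeatedly copying substrings by hashing them. Use a polynomial rolling hash so each substring hash can be computed incrementally while expanding the right boundary. For each starting index i, extend the substring character by character and update the hash value in constant time using hash = (hash * base + value) mod m, where m is a large prime and value is a nonzero code for the current character. Store each computed hash in a set. Because hash updates are O(1), generating all substring hashes takes O(n^2) time instead of O(n^3). The set of hashes stands in for the distinct substrings without storing the full text. The result is exact only if no two different substrings collide on the same hash, so choose a large modulus (or use two independent hashes) and consider keeping the substring length alongside each hash to keep the collision probability negligible. This technique is common in rolling hash and substring comparison problems.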
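A minimal sketch of the hashing approach follows. The base 131, the modulus 2^61 - 1, and the decision to store (length, hash) pairs are assumptions made for illustration and do not come from the original solution. The function name `count_distinct_hash` is likewise hypothetical.

```python
def count_distinct_hash(s: str, base: int = 131, mod: int = (1 << 61) - 1) -> int:
    """Count distinct substrings using an incrementally updated polynomial hash.

    The result is probabilistic: two different substrings of equal length
    could in principle share a hash, although that is very unlikely with
    a 61-bit modulus and n <= 500.
    """
    seen = set()
    n = len(s)
    for i in range(n):
        h = 0
        for j in range(i, n):
            # Map 'a'..'z' to 1..26 so that leading characters never hash to 0.
            value = ord(s[j]) - ord("a") + 1
            h = (h * base + value) % mod     # O(1) update for s[i..j]
            seen.add((j - i + 1, h))         # pair the hash with the length
    return len(seen)


print(count_distinct_hash("aabbaba"))  # 21
print(count_distinct_hash("abcdefg"))  # 28
```

Pairing each hash with the substring length means only substrings of equal length can collide, which lowers the collision risk at no extra asymptotic cost.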
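More advanced structures such as a trie, suffix automaton, or suffix array can also solve this problem by representing all suffixes of the string and counting unique paths. A trie built from every suffix is exact (no hash collisions) but still takes O(n^2) time and space, since each new node corresponds to one distinct substring. A suffix array with an LCP array typically reaches O(n log n), and a suffix automaton reaches O(n) for a fixed alphabet, which answers the follow-up. Both require significantly more implementation effort than hashing.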
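The two sketches below illustrate the trie and suffix automaton ideas mentioned above. They are minimal illustrations under stated assumptions: the function names `count_distinct_trie` and `count_distinct_sam` are hypothetical, and the automaton follows the standard online construction with dictionaries for transitions.

```python
def count_distinct_trie(s: str) -> int:
    """Insert every suffix into a trie; each new node is one distinct substring."""
    root = {}
    count = 0
    n = len(s)
    for i in range(n):
        node = root
        for j in range(i, n):
            ch = s[j]
            if ch not in node:
                node[ch] = {}
                count += 1      # first time this substring s[i..j] is seen
            node = node[ch]
    return count


def count_distinct_sam(s: str) -> int:
    """Build a suffix automaton; state v contributes length[v] - length[link[v]]."""
    length = [0]    # longest substring length represented by each state
    link = [-1]     # suffix links
    trans = [{}]    # outgoing transitions per state
    last = 0
    for ch in s:
        cur = len(length)
        length.append(length[last] + 1)
        link.append(-1)
        trans.append({})
        p = last
        while p != -1 and ch not in trans[p]:
            trans[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = trans[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split q: the clone keeps the shorter substrings.
                clone = len(length)
                length.append(length[p] + 1)
                link.append(link[q])
                trans.append(dict(trans[q]))
                while p != -1 and trans[p].get(ch) == q:
                    trans[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # Skip state 0 (the empty string).
    return sum(length[v] - length[link[v]] for v in range(1, len(length)))


print(count_distinct_trie("aabbaba"), count_distinct_sam("aabbaba"))  # 21 21
```

The trie version is the simpler of the two and gives an exact count, while the automaton is the one that scales linearly with the length of s.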
Recommended for interviews: Start by explaining the brute force enumeration to show you understand how substrings are generated and how duplicates arise. Then move to the rolling hash optimization. Interviewers typically expect the O(n^2) hashing solution because it demonstrates knowledge of substring hashing and efficient duplicate detection while keeping the implementation manageable during a coding interview.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Brute Force Enumeration with Set | O(n^3) | O(n^2) | Good for understanding substring generation or when input size is very small |
| Rolling Hash (String Hashing) | O(n^2) | O(n^2) | General solution for interviews and competitive programming where substring comparison must be efficient |