Given a string s, return the number of homogenous substrings of s. Since the answer may be too large, return it modulo 10^9 + 7.
A string is homogenous if all the characters of the string are the same.
A substring is a contiguous sequence of characters within a string.
Example 1:
Input: s = "abbcccaa"
Output: 13
Explanation: The homogenous substrings are listed below:
"a" appears 3 times.
"aa" appears 1 time.
"b" appears 2 times.
"bb" appears 1 time.
"c" appears 3 times.
"cc" appears 2 times.
"ccc" appears 1 time.
3 + 1 + 2 + 1 + 3 + 2 + 1 = 13.
Example 2:
Input: s = "xy"
Output: 2
Explanation: The homogenous substrings are "x" and "y".
Example 3:
Input: s = "zzzzz"
Output: 15
Constraints:
1 <= s.length <= 10^5
s consists of lowercase letters.
Problem Overview: Given a string s, count how many substrings consist of only one repeating character. A substring like "aaa" contributes multiple homogenous substrings: a, a, a, aa, aa, and aaa. The result must be returned modulo 10^9 + 7.
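As a quick check of the counting in the overview (a worked derivation, not part of the original statement): a run of k identical characters contains k - l + 1 homogenous substrings of each length l, so the total for the run is

```latex
\sum_{l=1}^{k} (k - l + 1) \;=\; \sum_{i=1}^{k} i \;=\; \frac{k(k+1)}{2}
```

For "aaa", k = 3 gives 3 + 2 + 1 = 6, matching the six substrings listed above. For Example 1, the runs have lengths 1, 2, 3, and 2, which gives 1 + 3 + 6 + 3 = 13.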
Approach 1: Two-Pointer Approach (O(n) time, O(1) space)
Traverse the string while tracking the length of the current run of identical characters. Use two pointers or a single index with a counter. When the current character matches the previous one, extend the run; otherwise reset the run length to 1. At every index, add the current run length to the total count. This works because if the run ending at the current index has length k, exactly k homogenous substrings end at that index: one for each length from 1 to k.
This approach relies on simple iteration and constant memory, making it ideal for large inputs. The algorithm scans the string once, updates a running count, and applies the modulo constraint after each addition. It’s a common pattern when solving run-length problems on string data.
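The incremental counting described above could be sketched in C as follows. This is a minimal illustration, not the original solution code: the function name `count_homogenous_incremental` is hypothetical, and the sketch assumes a NUL-terminated input string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts homogenous substrings of a NUL-terminated
 * string by adding the current run length at every index.
 * The result is reduced modulo 10^9 + 7. */
int count_homogenous_incremental(const char *s) {
    assert(s != NULL);
    const long long MOD = 1000000007LL;
    long long total = 0;
    long long run = 0; /* length of the run ending at the current index */

    for (size_t i = 0; s[i] != '\0'; i++) {
        if (i > 0 && s[i] == s[i - 1]) {
            run++;      /* same character: extend the run */
        } else {
            run = 1;    /* different character (or first one): new run */
        }
        /* exactly `run` homogenous substrings end at index i */
        total = (total + run) % MOD;
    }
    return (int)total;
}
```

Calling `count_homogenous_incremental("abbcccaa")` adds 1, then 1 + 2, then 1 + 2 + 3, then 1 + 2 across the four runs, for a total of 13.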
Approach 2: Mathematical Counting Approach (O(n) time, O(1) space)
Instead of counting substrings incrementally, group consecutive identical characters and compute their contribution using a formula. If a run of the same character has length k, the number of homogenous substrings inside that run equals k * (k + 1) / 2. Iterate through the string, measure each run length, apply the formula, and add the result to the answer.
This method separates the counting logic from traversal and emphasizes the combinatorial insight behind the problem. It is effectively run-length encoding combined with a simple math formula. The runtime remains linear because each character is processed once, and only a few integer variables are maintained.
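A sketch of the run-length variant in C, under the same assumptions as the problem statement (NUL-terminated string of lowercase letters). The function name `count_homogenous_by_runs` is made up for illustration. One detail worth noting: for k up to 10^5, k * (k + 1) does not fit in a 32-bit integer, so the sketch uses `long long` for the intermediate product.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: measures each maximal run of identical characters
 * and adds k * (k + 1) / 2 for a run of length k, modulo 10^9 + 7. */
int count_homogenous_by_runs(const char *s) {
    assert(s != NULL);
    const long long MOD = 1000000007LL;
    long long total = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        size_t j = i;
        /* advance j to the end of the run; the terminating NUL never
         * equals s[i], so this loop stops at the end of the string */
        while (s[j] == s[i]) {
            j++;
        }
        long long k = (long long)(j - i);          /* run length */
        total = (total + (k * (k + 1) / 2) % MOD) % MOD;
        i = j;                                     /* start of next run */
    }
    return (int)total;
}
```

For "zzzzz" there is a single run with k = 5, so the function returns 5 * 6 / 2 = 15.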
Recommended for interviews: The two-pointer method is usually the expected solution because it demonstrates strong control over iteration and incremental counting in two-pointer style scanning. The mathematical approach shows deeper understanding of how substring counts arise from run lengths, which can make the reasoning clearer. Both run in O(n) time with O(1) space and are considered optimal.
In this approach, we make use of two pointers to keep track of the start and end of a sequence of identical characters. As we iterate through the string, we update the end pointer when we find the same character, and on encountering a different character, we calculate the number of homogenous substrings formed using the formula for the sum of first n natural numbers: n * (n + 1) / 2, where n is the length of the sequence of identical characters. We repeat this process for each identified sequence and accumulate the total number of homogenous substrings.
In the C solution, we iterate through the string while maintaining the length of consecutive identical characters with the length variable. When a different character is encountered, we add the number of homogenous substrings formed by the previous sequence to count. The final count is taken modulo 10^9 + 7 as required.
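The C code this paragraph refers to is not included here. A sketch consistent with the description might look like the following; the `length` and `count` variables follow the text, while the function name `count_homogenous_flush` and the overall structure are assumptions. Note that the final run has to be added after the loop, since no differing character follows it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction: track `length` of the current run and add
 * length * (length + 1) / 2 to `count` whenever the character changes. */
int count_homogenous_flush(const char *s) {
    assert(s != NULL);
    const long long MOD = 1000000007LL;
    if (s[0] == '\0') {
        return 0; /* empty string has no substrings */
    }

    long long count = 0;
    long long length = 1; /* the first character starts a run of length 1 */

    for (size_t i = 1; s[i] != '\0'; i++) {
        if (s[i] == s[i - 1]) {
            length++;
        } else {
            /* run ended: add its contribution and start a new run */
            count = (count + length * (length + 1) / 2) % MOD;
            length = 1;
        }
    }
    /* add the contribution of the last run */
    count = (count + length * (length + 1) / 2) % MOD;
    return (int)count;
}
```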
Time Complexity: O(n), where n is the length of the string.
Space Complexity: O(1), as we only use a constant amount of extra space.
This approach counts homogenous substrings using the sum of an arithmetic progression. By identifying the start and end of each run of identical characters, we can determine how many homogenous substrings that run contributes. We iterate through the string, find the length n of each run, and compute its contribution with the formula n * (n + 1) / 2.
The C code initializes a result variable and iterates through the string to compute lengths of continuous characters. The formula for an arithmetic progression then calculates homogenous substrings from each segment.
Time Complexity: O(n) where n is the length of the string.
Space Complexity: O(1) since no extra space is used apart from fixed variables.
| Approach | Complexity |
|---|---|
| Two-Pointer Approach | Time Complexity: O(n), where n is the length of the string. |
| Mathematical Counting Approach | Time Complexity: O(n) where n is the length of the string. |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Two-Pointer Run-Length Counting | O(n) | O(1) | Best general solution. Single pass with incremental counting, common interview pattern. |
| Mathematical Run-Length Formula | O(n) | O(1) | When you prefer grouping characters first and computing substrings using the k*(k+1)/2 formula. |