The variance of a string is defined as the largest difference between the number of occurrences of any 2 characters present in the string. Note the two characters may or may not be the same.
Given a string s consisting of lowercase English letters only, return the largest variance possible among all substrings of s.
A substring is a contiguous sequence of characters within a string.
Example 1:
Input: s = "aababbb" Output: 3 Explanation: All possible variances along with their respective substrings are listed below: - Variance 0 for substrings "a", "aa", "ab", "abab", "aababb", "ba", "b", "bb", and "bbb". - Variance 1 for substrings "aab", "aba", "abb", "aabab", "ababb", "aababbb", and "bab". - Variance 2 for substrings "aaba", "ababbb", "abbb", and "babb". - Variance 3 for substring "babbb". Since the largest possible variance is 3, we return it.
Example 2:
Input: s = "abcde" Output: 0 Explanation: No letter occurs more than once in s, so the variance of every substring is 0.
Constraints:
1 <= s.length <= 104s consists of lowercase English letters.Problem Overview: Given a string s, find a substring where the difference between the frequency of two characters is maximized. The variance of a substring is defined as count(a) - count(b) for two characters that both appear in the substring.
Approach 1: Kadane's Algorithm for Character Pair (O(26² · n) time, O(1) space)
The key observation: variance only depends on two characters at a time. Iterate over every ordered pair of characters (a, b). Treat occurrences of a as +1 and b as -1, ignoring other characters. Now the problem becomes a modified maximum subarray problem, which can be solved using Kadane’s algorithm. While scanning the string, track the running sum and ensure the substring includes at least one b. If the running score becomes negative and more b characters remain ahead, reset the window. This ensures we only keep promising candidates while respecting the requirement that both characters appear.
This approach effectively runs Kadane’s algorithm for all 26 × 26 ordered character pairs. Since each pass scans the string once, the total time complexity is O(26² · n), which is effectively linear for lowercase strings. The algorithm uses constant extra memory because only counters and running sums are stored. This technique combines ideas from Array traversal and dynamic score tracking similar to Dynamic Programming.
Approach 2: Sliding Window Technique (O(26² · n) time, O(1) space)
Another way to think about the problem is maintaining a window that contains two characters while maximizing their count difference. For every character pair (a, b), expand a window across the string and maintain counts of both characters. When the count of b dominates too heavily, shrink or reset the window since the variance would become negative and unlikely to produce a better answer. This behaves similarly to Kadane’s reset rule but is implemented with explicit window boundaries and counters.
The sliding window perspective is useful if you prefer thinking in terms of substring ranges rather than cumulative scores. Each iteration updates counts and computes count(a) - count(b) while ensuring both characters appear at least once. Although implemented differently, the time complexity remains O(26² · n) because the algorithm still evaluates every character pair and scans the string once per pair. Space usage remains O(1).
Recommended for interviews: The Kadane-based solution is what most interviewers expect. It demonstrates the ability to reduce a substring optimization problem into a maximum subarray variant and shows strong understanding of state resets and pairwise enumeration. Explaining the brute intuition (checking character pairs) first, then deriving the Kadane transformation, clearly signals strong problem-solving skills.
This approach leverages a modified version of Kadane's algorithm to find the largest variance by calculating frequency differences for each character pair combination. We'll use a variation of Kadane's approach to track the balance difference in frequencies between two characters across the string.
The solution involves iterating over all possible pairs of characters (a, b). For each pair, we calculate the difference in frequency counts using a variation of Kadane's algorithm approach to determine the maximum variance for that pair. The balance is reset when frequency of 'b' exceeds 'a', avoiding any negative variance.
The time complexity is O(26^2 * n) = O(n), where n is the length of the string, due to iterating over each character pair and traversing the string. The space complexity is O(1) because we use only a fixed amount of extra space.
The second approach is using the sliding window technique, which dynamically adjusts the window as we scan through the string. This helps to efficiently find all possible maximum variances for character combinations by maintaining two counters and extending or shrinking the window as needed.
In this C solution using the sliding window technique, we maintain a dynamic window while iterating over the string. The window adjusts dynamically to keep the variance optimal, expanding and contracting based on character balance.
Time complexity is O(n) due to efficient window management, spanning the string with a single pass, and space complexity veins at O(1) for predictable and limited use.
Since the character set only contains lowercase letters, we can consider enumerating the most frequent character a and the least frequent character b. For a substring, the difference in the number of occurrences of these two characters is the variance of the substring.
Specifically, we use a double loop to enumerate a and b. We use f[0] to record the number of consecutive occurrences of character a ending at the current character, and f[1] to record the variance of the substring ending at the current character and containing both a and b. We iterate to find the maximum value of f[1].
The recurrence formula is as follows:
a, then both f[0] and f[1] are incremented by 1;b, then f[1] = max(f[1] - 1, f[0] - 1), and f[0] = 0;Note that initially setting f[1] to a negative maximum value ensures that updating the answer is valid.
The time complexity is O(n times |\Sigma|^2), where n is the length of the string, and |\Sigma| is the size of the character set. The space complexity is O(1).
Python
Java
C++
Go
TypeScript
| Approach | Complexity |
|---|---|
| Kadane's Algorithm for Character Pair | The time complexity is O(26^2 * n) = O(n), where n is the length of the string, due to iterating over each character pair and traversing the string. The space complexity is O(1) because we use only a fixed amount of extra space. |
| Sliding Window Technique | Time complexity is O(n) due to efficient window management, spanning the string with a single pass, and space complexity veins at O(1) for predictable and limited use. |
| Enumeration + Dynamic Programming | — |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Character Pair Enumeration (Conceptual Brute) | O(26² · n²) | O(1) | Useful for understanding the idea of checking variance for each character pair. |
| Kadane's Algorithm for Character Pair | O(26² · n) | O(1) | Optimal solution. Best for interviews and large inputs. |
| Sliding Window Technique | O(26² · n) | O(1) | Alternative implementation if you prefer window boundaries and frequency counters. |
Substring With Largest Variance | 2272 LeetCode | Maths | Leetcode Biweekly Contest 78 • CodeWithSunny • 12,719 views views
Watch 9 more video solutions →Practice Substring With Largest Variance with our built-in code editor and test cases.
Practice on FleetCode