Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
"ace" is a subsequence of "abcde".Example 1:
Input: s = "abcde", words = ["a","bb","acd","ace"] Output: 3 Explanation: There are three strings in words that are a subsequence of s: "a", "acd", "ace".
Example 2:
Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"] Output: 2
Constraints:
1 <= s.length <= 5 * 1041 <= words.length <= 50001 <= words[i].length <= 50s and words[i] consist of only lowercase English letters.To solve #792 Number of Matching Subsequences, the main challenge is efficiently checking whether each word in the list is a subsequence of the main string s. A naive approach would check every word character by character against s, but this becomes slow when the list of words is large.
A more efficient strategy is to preprocess the indices of characters in the main string using a hash map where each character maps to a sorted list of positions. Then, for each word, you can use binary search to find the next valid index in s that maintains subsequence order. This significantly speeds up matching.
Another optimized technique groups words by the character they are waiting for and processes them while scanning s. These approaches reduce repeated scanning and improve scalability for large inputs.
The optimized solutions typically achieve O(n + total_chars * log n) time using binary search or near O(n + total_chars) with bucket-based processing, while using additional space for indexing.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Index Mapping + Binary Search | O(n + k log n) | O(n) |
| Bucket/Waiting List Technique | O(n + k) | O(n + k) |
NeetCode
Brute Force Approach: In this approach, for each word in the words array, we will check whether it is a subsequence of the string s or not. This is achieved by iterating over the characters of the word and trying to match them sequentially in the string s.
Time Complexity: O(n * m), where n is the length of the string s and m is the average length of the words. Space Complexity: O(1) since it only uses constant extra space.
1#include <vector>
2#include <string>
3
4bool isSubsequence(const std::string& s, const std::string& word) {
5 int i = 0, j = 0;
6 while (i < s.size() && j < word.size()) {
7 if (s[i] == word[j]) {
8 j++;
9 }
10 i++;
11 }
12 return j == word.size();
13}
14
int numMatchingSubseq(const std::string& s, const std::vector<std::string>& words) {
int count = 0;
for (const auto& word : words) {
if (isSubsequence(s, word)) {
count++;
}
}
return count;
}This solution uses a helper function isSubsequence to check each word. For every word, if it is a subsequence of s, the count is incremented. The logic is simple and checks the characters one by one.
Efficient Group Matching: Group words by their initial characters in a dictionary and iterate over the string s to manage these groups of words, repositioning them towards completion as their current character is matched. This avoids re-checking each word from the beginning every time.
Time Complexity: O(n + m), where n is the length of s and m is the total number of characters across all words. Space Complexity: O(m) due to storage of iterators.
1from collections import defaultdict
2
3defWatch expert explanations and walkthroughs
Practice problems asked by these companies to ace your technical interviews.
Explore More ProblemsJot down your thoughts, approach, and key learnings
Preprocessing the indices of characters in the main string allows faster lookups when verifying subsequences. Instead of scanning the entire string repeatedly, binary search can quickly find the next valid character position.
Yes, this problem is a popular medium-level string and subsequence question that tests optimization techniques. It often appears in interviews at large tech companies because it evaluates knowledge of hashing, binary search, and efficient string processing.
Commonly used data structures include hash maps for character indexing, arrays or lists for storing positions, and queues or buckets for grouping words waiting for specific characters. These structures help reduce repeated scans of the main string.
One optimal approach preprocesses the main string by storing indices of each character and then uses binary search to validate each word as a subsequence. Another efficient method uses a bucket system where words wait for their next required character while scanning the string.
In this Python approach, we store word iterators in a dictionary waiting, initially keyed by their starting character. As we advance through s, we update the waiting list for each character to track partially matched words, dramatically improving efficiency.