This approach utilizes a reversed Trie (prefix tree) to efficiently manage and query suffixes of input characters. By storing the reversed words in a Trie, we can match suffixes of the input stream by querying from the current character backwards. For each character queried, we move through the Trie from those characters up to a certain limit (or until no further match) to see if any of the suffixes match one of the stored words.
Time Complexity for insertion: O(N*L), where N is the number of words and L is the maximum length of a word.
Time Complexity for queries: O(Q*L) where Q is the number of queries and L is the maximum length of a word.
Space Complexity: O(N*L) for storing the words in the Trie.
1class TrieNode:
2 def __init__(self):
3 self.children = {}
4 self.is_end_of_word = False
5
6class StreamChecker:
7 def __init__(self, words):
8 self.root = TrieNode()
9 self.stream = []
10 for word in words:
11 node = self.root
12 for char in reversed(word):
13 if char not in node.children:
14 node.children[char] = TrieNode()
15 node = node.children[char]
16 node.is_end_of_word = True
17
18 def query(self, letter):
19 self.stream.append(letter)
20 node = self.root
21 for char in reversed(self.stream):
22 if node.is_end_of_word:
23 return True
24 if char not in node.children:
25 return False
26 node = node.children[char]
27 return node.is_end_of_word
The implementation involves creating a Trie that stores each word in reverse. When a query is made, characters are added to a list representing the current stream. We then check from the current end of the stream backwards through the Trie to determine if any suffix matches a word.
The Aho-Corasick algorithm is useful for searching multiple 'dictionary' words within a text. We build an automaton with all words and query the stream letters. This data structure enables matching of all dictionary words in parallel through a finite state machine that jumps quickly between states even upon mismatches.
Time Complexity for building: O(N*L), because we insert and construct failure links.
Time Complexity for queries: O(Q*L) leveraging finite state machine transitions.
Space Complexity: O(N*L) due to the state-space storage representing words.
1#include <vector>
2#include <queue>
3#include <unordered_map>
4using namespace std;
5
6class AhoCorasick {
7public:
8 struct TrieNode {
9 unordered_map<char, TrieNode*> children;
10 TrieNode* fail = nullptr;
11 bool is_end = false;
12 };
13
14 TrieNode *root = new TrieNode();
15
16 void insert(const string &word) {
17 TrieNode *node = root;
18 for(char c : word) {
19 if(!node->children.count(c)) {
20 node->children[c] = new TrieNode();
21 }
22 node = node->children[c];
23 }
24 node->is_end = true;
25 }
26
27 void build() {
28 queue<TrieNode*> q;
29 root->fail = root;
30 for(auto &p : root->children) {
31 p.second->fail = root;
32 q.push(p.second);
33 }
34 while(!q.empty()) {
35 TrieNode *curr = q.front(); q.pop();
36 for(auto &p : curr->children) {
37 char c = p.first;
38 TrieNode *child = p.second;
39 TrieNode *fail = curr->fail;
40 while(fail != root && !fail->children.count(c)) {
41 fail = fail->fail;
42 }
43 if(fail->children.count(c)) {
44 child->fail = fail->children[c];
45 } else {
46 child->fail = root;
47 }
48 child->is_end |= child->fail->is_end;
49 q.push(child);
50 }
51 }
52 }
53};
54
55class StreamChecker {
56 AhoCorasick ac;
57 AhoCorasick::TrieNode* node;
58 public:
59 StreamChecker(vector<string>& words) {
60 for(const string &word : words) {
61 ac.insert(word);
62 }
63 ac.build();
64 node = ac.root;
65 }
66
67 bool query(char letter) {
68 while(node != ac.root && !node->children.count(letter)) {
69 node = node->fail;
70 }
71 if(node->children.count(letter)) {
72 node = node->children[letter];
73 }
74 return node->is_end;
75 }
76};
This C++ solution implements the Aho-Corasick algorithm: