
Sponsored
Sponsored
Utilize bitmasks to represent words and puzzles. This allows for quick inclusion checks (using bitwise AND) and processing of the large dataset efficiently.
We construct a bitmask for each word and store it in a dictionary with counts. For each puzzle, generate all possible submasks (the example uses a different method) and check if they are present in the preprocessed dictionary.
Time Complexity: O(W + P * 27), where W is the number of words and P is the number of puzzles.
Space Complexity: O(W), due to storing the bitmask of each word.
1def find_num_of_valid_words(words, puzzles):
2 from collections import defaultdict
3 word_count = defaultdict(int)
4 def bitmask(word):
5 bm = 0
6 for char in set(word):
7 bm |= 1 << (ord(char) - ord('a'))
8 return bm
9
10 for word in words:
11 word_count[bitmask(word)] += 1
12
13 results = []
14
15 for puzzle in puzzles:
16 first = 1 << (ord(puzzle[0]) - ord('a'))
17 puzzle_mask = bitmask(puzzle)
18 count = 0
19 submask = puzzle_mask
20
21 while submask:
22 if submask & first:
23 count += word_count[submask]
24 submask = (submask - 1) & puzzle_mask
25
26 results.append(count)
27
28 return results
29
30# Example usage
31words = ["aaaa","asas","able","ability","actt","actor","access"]
32puzzles = ["aboveyz","abrodyz","abslute","absoryz","actresz","gaswxyz"]
33print(find_num_of_valid_words(words, puzzles))The solution involves the following steps:
Loop through each puzzle and for each puzzle, check every word in the word list to see if it is valid. This approach is straightforward but inefficient for larger datasets.
Time Complexity: O(W * P * L), where W is the number of words, P is the number of puzzles, and L is the average length of the words.
Space Complexity: O(1), as no additional data structures are used that scale with input size.
1def find_num_of_valid_words(words, puzzles):
2 results = []
3 def
Using a bitmask, we can represent the presence of each letter in a word or puzzle as a single integer. Each bit in this integer corresponds to a letter from 'a' to 'z'.
For each word, compute its bitmask and store it in a hash map with its frequency. When processing puzzles, derive all potential subsets of the puzzle's bitmask that contain the first letter of the puzzle. Use the precomputed hash map to count how many words have matching character sets.
Time Complexity: O(N + P * 2^L), where N is the number of words, P is the number of puzzles, and L is the max number of letters in the puzzle (L = 7), resulting in 2^L subsets.
Space Complexity: O(N) for storing bitmasks and their frequencies.
1def findNumOfValidWords(words, puzzles):
This approach uses direct validation of each word against each puzzle by checking the conditions separately using set operations. Early stopping is applied if any character not in the puzzle is found or if the word doesn't contain the first puzzle character.
Time Complexity: O(P * N * L), where P is the number of puzzles, N is the number of words, and L is the average length of each word.
Space Complexity: O(1) additional space besides input storage.
1import java.util.*;
2
3public class Solution
Check each word against each puzzle, ensuring the word contains the first letter of the puzzle and no characters not in the puzzle. This approach is easy to understand but computationally expensive.
This solution first converts each word into a bitmask representing the presence of its characters using a get_bitmask function. The word_count dictionary stores the number of words for each bitmask.
For each puzzle, it generates the bitmask and processes each subset that includes the first puzzle letter using bit manipulation. It checks each subset against precomputed word masks for potential matches and counts them. This avoids unnecessary recomputation and efficiently selects valid words.
This Java solution leverages set membership tests for characters to ensure each word meets the puzzle's requirements. For each puzzle, it tracks characters in a set and verifies if each word abides by puzzle constraints, including early stopping.
The solution is less efficient for larger input sizes but straightforward to understand and implement directly.