A sentence consists of lowercase letters ('a' to 'z'), digits ('0' to '9'), hyphens ('-'), punctuation marks ('!', '.', and ','), and spaces (' ') only. Each sentence can be broken down into one or more tokens separated by one or more spaces ' '.
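A token is a valid word if all three of the following are true:
- It only contains lowercase letters, hyphens, and/or punctuation (no digits).
- There is at most one hyphen '-'. If present, it must be surrounded by lowercase characters ("a-b" is valid, but "-ab" and "ab-" are not valid).
- There is at most one punctuation mark. If present, it must be at the end of the token ("ab,", "cd!", and "." are valid, but "a!b" and "c.," are not valid).

Examples of valid words include "a-b.", "afad", "ba-c", "a!", and "!".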
Given a string sentence, return the number of valid words in sentence.
Example 1:
Input: sentence = "cat and dog"
Output: 3
Explanation: The valid words in the sentence are "cat", "and", and "dog".

Example 2:
Input: sentence = "!this 1-s b8d!"
Output: 0
Explanation: There are no valid words in the sentence. "!this" is invalid because it starts with a punctuation mark. "1-s" and "b8d" are invalid because they contain digits.

Example 3:
Input: sentence = "alice and bob are playing stone-game10"
Output: 5
Explanation: The valid words in the sentence are "alice", "and", "bob", "are", and "playing". "stone-game10" is invalid because it contains digits.
Constraints:
- 1 <= sentence.length <= 1000
- sentence only contains lowercase English letters, digits, ' ', '-', '!', '.', and ','.
- There is at least 1 token.

Problem Overview: Given a sentence containing lowercase letters, digits, spaces, hyphens, and punctuation marks ('!', '.', ','), count how many tokens are valid words. A valid word consists of lowercase letters, may include at most one hyphen surrounded by letters, and may end with a single punctuation mark. Tokens cannot contain digits.
The challenge is mostly careful string validation. You split the sentence by spaces and verify each token against the rules. The tricky parts are handling the hyphen placement and ensuring punctuation only appears at the end.
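Because tokens may be separated by more than one space, the way the split is done matters. The snippet below is a minimal sketch of the tokenization step; the helper name `tokenize` is an illustrative assumption, not something defined by the problem.

```python
def tokenize(sentence: str) -> list[str]:
    """Split on runs of whitespace.

    str.split() with no argument collapses consecutive spaces and
    ignores leading/trailing ones, so no empty tokens are produced.
    """
    return sentence.split()


# Splitting on a single space keeps empty strings between adjacent spaces.
print("cat  and dog".split(" "))   # ['cat', '', 'and', 'dog']
print(tokenize("cat  and dog"))    # ['cat', 'and', 'dog']
```

If a language only offers splitting on a single delimiter, empty tokens have to be filtered out before validation.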
Approach 1: Simple Iterative Validation (O(n) time, O(n) space)
Split the sentence into tokens on whitespace and validate each token character by character. Track whether a hyphen or a punctuation mark has already appeared. While iterating, reject the token if you encounter a digit, a second hyphen, or a punctuation mark that is not the last character. If a hyphen appears, verify that the characters immediately before and after it are lowercase letters. This approach uses direct character checks and a few boolean flags, so the validation itself needs only constant extra state; the O(n) space comes from the token list produced by the split.
The key insight: every rule can be validated in a single pass over the token. Since each character is processed exactly once, the total runtime across the sentence is linear. This method relies purely on basic string processing and conditional checks, which makes it fast and easy to implement in languages like C++, Java, or Python.
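The single-pass check with flags can be sketched as follows. This is one possible Python rendering under the rules stated in the problem; the names `is_valid_word` and `count_valid_words` are illustrative assumptions.

```python
PUNCTUATION = "!.,"


def is_valid_word(token: str) -> bool:
    """Validate one token in a single left-to-right pass."""
    seen_hyphen = False
    seen_punct = False
    for i, ch in enumerate(token):
        if seen_punct:
            # A character follows a punctuation mark, so the mark was not last.
            return False
        if ch.isdigit():
            return False
        if ch in PUNCTUATION:
            seen_punct = True
        elif ch == "-":
            if seen_hyphen:
                return False  # more than one hyphen
            if i == 0 or i == len(token) - 1:
                return False  # hyphen at the start or end
            # Both neighbours must be lowercase letters.
            if not ("a" <= token[i - 1] <= "z" and "a" <= token[i + 1] <= "z"):
                return False
            seen_hyphen = True
    return True


def count_valid_words(sentence: str) -> int:
    """Count the tokens of `sentence` that pass is_valid_word."""
    return sum(is_valid_word(token) for token in sentence.split())


print(count_valid_words("cat and dog"))  # 3
```

Each character is inspected once, so the work is proportional to the length of the sentence.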
Approach 2: Regular Expression Matching (O(n) time, O(n) space)
A compact alternative is to encode the rules in a regular expression. After splitting the sentence into tokens, match each token against a pattern describing valid words. A typical pattern allows lowercase letters, an optional internal hyphen surrounded by letters, and an optional punctuation mark at the end. If the token matches the pattern, count it as valid.
This approach shifts the validation logic into the regex engine. The pattern effectively describes the grammar of a valid token, reducing manual checks in code. Performance remains linear relative to the sentence length because each token is matched once. It is especially concise in Python or JavaScript where regex support is straightforward.
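A pattern of the kind described here might look like the sketch below. The exact expression and the names `VALID_WORD` and `count_valid_words_regex` are assumptions chosen for illustration; other equivalent patterns exist.

```python
import re

# Optional letter part (letters, with at most one hyphen between two letter
# runs), followed by an optional single punctuation mark at the end.
VALID_WORD = re.compile(r"(?:[a-z]+(?:-[a-z]+)?)?[!.,]?")


def count_valid_words_regex(sentence: str) -> int:
    """Count tokens that match VALID_WORD in their entirety."""
    # split() never yields empty tokens, so the empty match is not an issue.
    return sum(1 for token in sentence.split() if VALID_WORD.fullmatch(token))


print(count_valid_words_regex("alice and bob are playing stone-game10"))  # 5
```

`fullmatch` is used so the whole token must fit the pattern; with `re.match`, which only anchors at the start, the pattern would need an explicit end anchor.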
Recommended for interviews: The iterative validation approach is usually preferred. It demonstrates that you can translate problem constraints into precise character checks and control flow. The regex solution is elegant but hides the reasoning inside a pattern, which some interviewers consider less explicit. Showing the manual validation first proves you understand the rules; mentioning the regex alternative shows broader familiarity with string parsing techniques.
This approach involves splitting the sentence into tokens and iterating over each token to validate it based on the given criteria. This is a straightforward method that checks each character of a token to determine its validity.
This solution uses the standard C library functions to split the sentence into tokens using spaces. It then checks each token for validity by ensuring it contains no digits, at most one hyphen surrounded by letters, and at most one punctuation mark at the end.
Time Complexity: O(n), where n is the length of the sentence.
Space Complexity: O(n), due to the necessity to duplicate the string for tokenization.
This approach leverages regular expressions to validate words by matching each token against a pre-defined pattern. It simplifies character validation and incorporates the constraints into a single expression.
The Python code uses a regular expression to define the valid word pattern: no digits, an optional hyphen between lowercase letters, and an optional punctuation mark at the end. It splits the sentence into tokens and uses `re.match` to keep only the valid words.
Time Complexity: O(n), where n is the length of the sentence.
Space Complexity: O(n), considering token splitting and regex storage.
First, we split the sentence into words by spaces, and then check each word to determine if it is a valid word.
For each word, we can use a boolean variable st to record whether a hyphen has already appeared, and then traverse each character in the word, judging according to the rules described in the problem.
For each character s[i], we have the following cases:
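- If s[i] is a digit, then s is not a valid word, and we return false directly.
- If s[i] is a punctuation mark ('!', '.', ',') and i < len(s) - 1, then s is not a valid word, and we return false directly.
- If s[i] is a hyphen, then we need to check that the following conditions are met: no hyphen has appeared before (st is false), the hyphen is not the first or last character of s, and the characters on both sides of it are lowercase letters. If any condition fails, we return false; otherwise we set st to true.
- If s[i] is a letter, then we do not need to do anything.

Finally, we count the number of valid words in the sentence.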
The time complexity is O(n), and the space complexity is O(n). Here, n is the length of the sentence.
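The case analysis above, with the boolean `st` and the index check for punctuation, could be written along these lines. This is a hedged sketch rather than the original listing; `check` and `count_valid_words_simulation` are illustrative names.

```python
def check(s: str) -> bool:
    """Return True if the token s is a valid word."""
    st = False  # whether a hyphen has already appeared
    for i, c in enumerate(s):
        if c.isdigit():
            return False
        if c in "!.," and i < len(s) - 1:
            return False  # punctuation before the last position
        if c == "-":
            if st or i == 0 or i == len(s) - 1:
                return False  # repeated hyphen, or hyphen at an edge
            if not (s[i - 1].islower() and s[i + 1].islower()):
                return False  # a neighbour is not a lowercase letter
            st = True
    return True


def count_valid_words_simulation(sentence: str) -> int:
    """Split on whitespace and count the tokens accepted by check."""
    return sum(check(s) for s in sentence.split())


print(count_valid_words_simulation("!this 1-s b8d!"))  # 0
```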
| Approach | Complexity |
|---|---|
| Simple Iterative Approach | Time: O(n), space: O(n), where n is the length of the sentence. |
| Regular Expression Approach | Time: O(n), space: O(n), where n is the length of the sentence. |
| Simulation | Time: O(n), space: O(n), where n is the length of the sentence. |
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Simple Iterative Validation | O(n) | O(n) for the token list, O(1) validation state | Best general solution. Clear rule checking and preferred in interviews. |
| Regular Expression Matching | O(n) | O(n) for the token list | Useful when regex support is strong and you want concise validation logic. |