Add unspokenPunctuation to SpeechRecognition (#187) #188
Merged: evanbliu merged 4 commits into WebAudio:main from evanbliu:feature/unspoken-punctuation · May 21, 2026 · +65 −0
Commits:
1a0e44a Add unspokenPunctuation to SpeechRecognition (#187) (evliu-google)
037bc75 Add explainer for unspokenPunctuation (evliu-google)
e18102b Document automatic capitalization with unspoken punctuation (evanbliu)
37ab956 Enhance unspoken punctuation documentation (evanbliu)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| # Explainer: Unspoken Punctuation for the Web Speech API | ||
|
|
||
| ## Introduction | ||
|
|
||
| The Web Speech API provides powerful speech recognition capabilities to web applications. However, continuous speech often lacks explicit spoken punctuation commands. When building casual voice typing tools, automated transcription services, and conversational assistants, developers frequently need to post-process the raw text to insert punctuation, making it readable and natural. | ||
|
|
||
| To address this, we introduce **unspoken punctuation** to the Web Speech API. This feature lets developers configure the speech recognition engine to infer and insert punctuation marks (such as periods, commas, and question marks) automatically, based on natural pauses, grammatical structure, and prosody, without requiring the user to speak punctuation commands (e.g., saying "period" or "comma"). | ||
|
|
||
| ## Why Use Unspoken Punctuation? | ||
|
|
||
| ### 1. **Customization for Different Use Cases** | ||
| Different speech recognition contexts require distinct handling of text flow. Casual voice typing, automated transcription, and conversational assistants greatly benefit from automatic punctuation to produce readable, polished text out of the box. Conversely, precise dictation tools, coding via voice, or raw acoustic logging applications may require verbatim, unpunctuated streams where punctuation is strictly controlled by explicit user commands. | ||
|
|
||
| ### 2. **Enhanced User Experience** | ||
| Because natural, continuous speech rarely includes spoken punctuation commands, letting developers enable automatic punctuation lowers the cognitive load on end users. Voice input feels more intuitive and conversational, and developers no longer need to implement complex downstream NLP models to handle basic text formatting. | ||
|
|
||
|
|
||
| ## New API Components | ||
|
|
||
| The unspoken punctuation feature is implemented through a new `unspokenPunctuation` boolean attribute on the `SpeechRecognition` interface. | ||
|
|
||
| ### `SpeechRecognition.unspokenPunctuation` attribute | ||
| This boolean attribute controls whether the speech recognition engine automatically infers and inserts punctuation marks. | ||
|
|
||
| - When set to `true`, the user agent should automatically insert punctuation based on natural pauses and grammatical structure. | ||
|
|
||
| - When set to `false`, the user agent must not insert unspoken punctuation, requiring the user to explicitly dictate punctuation commands. | ||
| - The default value is `false` to maintain backward compatibility with existing applications and ensure deterministic, unformatted text outputs unless explicitly opted into by the developer. | ||
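Because assigning to an unknown property on a JavaScript object silently creates a plain property, an application may want to check for the attribute before relying on it. The following is a minimal sketch of such a check; the helper names `supportsUnspokenPunctuation` and `enableUnspokenPunctuation` are hypothetical and not part of the proposal, and the fallback to the prefixed `webkitSpeechRecognition` constructor is an assumption about the host browser.

```javascript
// Hypothetical helper (not part of the proposal): true only if the
// recognizer object already exposes an `unspokenPunctuation` attribute.
function supportsUnspokenPunctuation(recognition) {
  return (
    typeof recognition === 'object' &&
    recognition !== null &&
    'unspokenPunctuation' in recognition
  );
}

// Hypothetical helper: opts in when supported and reports whether the
// setting took effect. On unsupported engines the object is left untouched.
function enableUnspokenPunctuation(recognition) {
  if (!supportsUnspokenPunctuation(recognition)) {
    return false;
  }
  recognition.unspokenPunctuation = true;
  return recognition.unspokenPunctuation === true;
}

// Browser usage; skipped where no SpeechRecognition constructor exists.
const RecognitionCtor =
  globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
if (RecognitionCtor) {
  const recognition = new RecognitionCtor();
  if (!enableUnspokenPunctuation(recognition)) {
    // Assumption: the app falls back to explicit spoken punctuation commands.
    console.log('unspokenPunctuation is not supported by this user agent.');
  }
}
```

When the check fails, the recognizer behaves as it does today, which matches the `false` default described above.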
|
|
||
| ## Example Usage | ||
|
|
||
| ```javascript | ||
| const recognition = new SpeechRecognition(); | ||
|
|
||
| // Enable unspoken punctuation | ||
| recognition.unspokenPunctuation = true; | ||
|
|
||
|
|
||
| // Configure other settings | ||
| recognition.continuous = true; | ||
| recognition.interimResults = true; | ||
| recognition.lang = 'en-US'; | ||
|
|
||
| recognition.onresult = (event) => { | ||
| for (let i = event.resultIndex; i < event.results.length; ++i) { | ||
| if (event.results[i].isFinal) { | ||
| console.log('Final Result with Punctuation: ', event.results[i][0].transcript); | ||
| // Example output: "Hello there, how are you today?" | ||
| // Instead of: "hello there how are you today" | ||
| } else { | ||
| console.log('Interim Result: ', event.results[i][0].transcript); | ||
| } | ||
| } | ||
| }; | ||
|
|
||
| recognition.start(); | ||
| ``` | ||
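Applications that assemble a running transcript may find it convenient to separate that logic from the event handler. The sketch below shows one possible way to do so; `collectFinalTranscript` is a hypothetical helper, and joining segments with a single space is an assumption that suits space-delimited languages such as English but not every language.

```javascript
// Hypothetical helper: concatenates the top alternative of every final
// result, starting at `startIndex` (e.g. event.resultIndex).
// `results` is expected to look like a SpeechRecognitionResultList:
// indexable, with a `length`, each item having `isFinal` and item[0].transcript.
function collectFinalTranscript(results, startIndex = 0) {
  const parts = [];
  for (let i = startIndex; i < results.length; ++i) {
    if (results[i].isFinal) {
      parts.push(results[i][0].transcript.trim());
    }
  }
  // Assumption: segments are separated by one space.
  return parts.join(' ');
}
```

Inside the `onresult` handler above, this could be called as `collectFinalTranscript(event.results, event.resultIndex)`.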
|
|
||
| ### Note on Automatic Capitalization | ||
| In many modern speech-to-text engines, automatic punctuation is tightly coupled with automatic capitalization. When `unspokenPunctuation` is set to `true`, developers should expect that the underlying recognition engine may also automatically capitalize the first word following an inferred sentence-ending punctuation mark (e.g., a period or question mark). Because this behavior depends on the specific platform and OS implementation, developers should not assume the resulting text will remain strictly lowercase when this flag is enabled. | ||
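Code that compares transcripts against fixed phrases (for example, voice commands) may therefore want to normalize case and punctuation first. The following is a minimal sketch of one way to do that; `normalizeTranscript` is a hypothetical helper, not part of the proposal, and stripping every Unicode punctuation character is an assumption that also removes apostrophes and hyphens.

```javascript
// Hypothetical helper: lowercases a transcript, removes all Unicode
// punctuation characters, and collapses runs of whitespace.
// Note: \p{P} also matches apostrophes, so "don't" becomes "dont".
function normalizeTranscript(transcript) {
  return transcript
    .toLowerCase()
    .replace(/\p{P}/gu, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Both forms of the explainer's example reduce to the same string.
const punctuated = normalizeTranscript('Hello there, how are you today?');
const raw = normalizeTranscript('hello there how are you today');
console.log(punctuated === raw); // true
```

Matching on the normalized form keeps command handling independent of whether `unspokenPunctuation` is enabled or how a given engine capitalizes its output.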
|
|
||
| ### Internationalization | ||
| Punctuation and spacing rules vary by language (e.g., Spanish opens a question with `¿`). When `unspokenPunctuation` is enabled, the exact formatting is implementation-dependent; the underlying engine is expected to apply the localized rules for the language specified by the `SpeechRecognition.lang` attribute. | ||