Technical review: Document new web speech api features #41145
Conversation
Removing myself as reviewer for now
{{APIRef("Web Speech API")}}
The **`available()`** static method of the [Web Speech API](/en-US/docs/Web/API/Web_Speech_API) checks whether specified languages are available for speech recognition either locally on the user's computer, or via a remote service.
Technically, the API can be used to check if speech recognition is available matching a set of options. Right now, the only option other than language is processLocally, which can be used to guarantee that local recognition is used but cannot be used to guarantee that a remote service is used. If processLocally is false, speech recognition may happen anywhere.
OK, I think I get the issue here — the initial description was a little bit ambiguous. I've cut it down to just say "...checks whether specified languages are available for speech recognition", and then made sure that I explain clearly what the `processLocally` option does in the Parameters description below. I added a note to clarify that you can't use `available()` to determine whether a remote service is guaranteed to support the specified languages.

Let me know what you think of the updates.
Looks good to me!
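For readers skimming this thread, here's a minimal sketch of the call being discussed. The `langs` and `processLocally` option names are taken from the pages under review; treat this as illustrative rather than definitive:

```js
// Check whether en-US speech recognition can run entirely on-device.
SpeechRecognition.available({ langs: ["en-US"], processLocally: true }).then(
  (status) => {
    // status is an enumerated string; see the return values below.
    console.log(`en-US on-device recognition: ${status}`);
  },
);
```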
A {{domxref("Promise")}} that resolves with an enumerated value indicating the availability of the specified languages for speech recognition. Possible values are:

- `available`
- : Indicates that the specified languages are available. If `processLocally` is set to `true`, `available` means that the required language packs have been downloaded and installed on the user's computer. If `processLocally` is set to `false`, `available` means that speech recognition is availale for those languages either on-device or remotely.
availale->available
Good spot; fixed!
- `processLocally` {{optional_inline}}
  - : A boolean that specifies whether you are checking availability of the specified languages for [on-device speech recognition](/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API#on-device_speech_recognition) (`true`) or availability of the specified languages for on-device or remote speech recognition (`false`). Defaults to `false`.
### Return value
It might be a good idea to document what happens when multiple languages are specified with different availability: https://webaudio.github.io/web-speech-api/#availability-algorithm
Can you clarify what you think needs explaining?
In the description of the different possible return values, I've explained that `available` is returned if support for all the languages is available, and `unavailable` is returned if at least one language is not supported.

I've done a bit of restructuring to make this clearer. Am I misunderstanding something?
The available() method takes in a collection of languages, so if a user calls it with en-US and fr-FR for example, en-US might be "available" but fr-FR might be "downloading", so the method would resolve with "downloading" since it only returns a single status for all of the languages specified. The section on the availability algorithm in the spec describes this behavior.
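To make that behavior concrete, here's a sketch under the assumption the reviewer describes (one combined status for all requested languages, per the spec's availability algorithm):

```js
async function checkLanguages() {
  // en-US might already be installed while fr-FR is still downloading;
  // the promise resolves with a single status covering both languages,
  // so this could log "downloading" even though en-US is ready.
  const status = await SpeechRecognition.available({
    langs: ["en-US", "fr-FR"],
    processLocally: true,
  });
  console.log(`Combined availability: ${status}`);
}
```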
### Parameters

- `options`
Technically, the install method takes in the same SpeechRecognitionOptions parameter as the available method. The behavior is undefined in the spec when install is called with processLocally=false, though; we should probably fix that at some point.
Yeah, the way the spec is written makes it sound like "both take the same options object, but install() doesn't use the processLocally property", which made me think "in that case they should be defined as two separate dictionaries".
I decided to just not draw attention to it for now.
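For context, a sketch of how `available()` and `install()` might be paired given that shared options dictionary. The boolean resolution value for `install()` is my reading of the pages in this PR, not something this thread confirms:

```js
async function ensureOnDeviceEnglish() {
  const status = await SpeechRecognition.available({
    langs: ["en-US"],
    processLocally: true,
  });
  if (status === "downloadable") {
    // install() takes the same options object; as noted above, what
    // processLocally: false would mean here is currently undefined.
    const success = await SpeechRecognition.install({
      langs: ["en-US"],
      processLocally: true,
    });
    console.log(success ? "Language pack installing" : "Install refused");
  }
}
```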
```js
const phraseData = [
  { phrase: "azure", boost: 10.0 },
  // …
];
```
According to the spec, "A float representing approximately the natural log of the number of times more likely the website thinks this phrase is than what the speech recognition model knows. A valid boost must be a float value inside the range [0.0, 10.0], with a default value of 1.0 if not specified. A boost of 0.0 means the phrase is not boosted at all, and a higher boost means the phrase is more likely to appear. A boost of 10.0 means the phrase is extremely likely to appear and should be rarely set."
A boost of 10.0 for the phrase "azure" might result in phrases erroneously recognized as "azure".
OK, understood. I've put this value down to `5.0` in all the source code listings. Is that more sensible?

I've also added a note to the `SpeechRecognitionPhrase()` constructor and `boost` property pages to warn about setting your boost values too high.
Sounds good!
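For illustration, here's roughly what the toned-down listings might look like. The second phrase and the mapping into `SpeechRecognitionPhrase` objects are illustrative, following the shape of the excerpt above:

```js
const recognition = new SpeechRecognition();

// Contextual biasing: moderate boosts for domain terms. Per the review
// comment above, values near the 10.0 ceiling should rarely be used.
const phraseData = [
  { phrase: "azure", boost: 5.0 },
  { phrase: "entra", boost: 3.0 },
];
recognition.phrases = phraseData.map(
  ({ phrase, boost }) => new SpeechRecognitionPhrase(phrase, boost),
);
```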
- `audio-capture`
  - : Audio capture failed.
- `bad-grammar`
  - : There was an error in the speech recognition grammar or semantic tags, or the chosen grammar format or semantic tag format was unsupported.
FYI, the bad-grammar error no longer appears in the Web Speech API spec.
Good to know. I've put the icons for deprecated and non-standard next to the error name, and added a note saying that it has been removed, along with the concept of grammar.
Do you know if it is still available in the browser implementation? If not, we could even just remove it altogether.
Yeah, it still exists in the Chromium implementation.
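Since the code is staying documented (as deprecated), a quick sketch of where it would surface at runtime; the handler shape follows the standard `SpeechRecognitionErrorEvent` interface:

```js
const recognition = new SpeechRecognition();

recognition.addEventListener("error", (event) => {
  // event.error holds the code, e.g. "audio-capture", or the
  // deprecated "bad-grammar" in engines that still emit it.
  console.error(`Recognition error: ${event.error} (${event.message})`);
});
```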
The Web Speech API has a main controller interface for this — {{domxref("SpeechRecognition")}} — plus several related interfaces for representing results, etc.
Generally, the default speech recognition system available on the device will be used for the speech recognition — most modern OSes have a speech recognition system for issuing voice commands. Think about Dictation on macOS or Cortana on Windows. On some browsers, such as Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won't work offline.
I think "Cortana" was rebranded as "Microsoft Copilot"
Chrome recently launched on-device Web Speech, so it would be more accurate to say that speech recognition "may" involve a server-based recognition. Or it might be better to just use a different example like Safari that always uses server-side speech recognition.
I've replaced Cortana with Copilot — shows how long ago these docs were originally written!
I've restructured this bit to avoid the server-based versus on-device issue you highlighted. Let me know what you think.
LGTM!
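As a baseline for the restructured paragraph, the usual controller-interface flow (the `webkit` prefix fallback reflects long-standing Chrome behavior):

```js
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.lang = "en-US";

recognition.onresult = (event) => {
  console.log(`Heard: ${event.results[0][0].transcript}`);
};

// Whether this runs on-device or via a remote service depends on the
// browser and on settings like processLocally discussed earlier.
recognition.start();
```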
Description
Chrome 142 desktop adds support for the on-device speech recognition functionality of the Web Speech API. See https://chromestatus.com/feature/6090916291674112.
This PR documents the new functionality, along with a few related bits, and makes some needed updates.
Specifically, it:

- Documents the new `SpeechRecognition` `available()` and `install()` methods.
- Documents the new `SpeechRecognition` `phrases` and `processLocally` properties.
- Documents the new `SpeechRecognitionPhrase` interface.
- Documents the new `on-device-speech-recognition` Permissions Policy directive.
- Updates the `SpeechRecognition.start()` page to include the MediaStreamTrack parameter version (see the sketch after this list).
- Marks the `SpeechGrammar` and `SpeechGrammarList` interfaces as deprecated and explains the story behind them.
- Removes the `SpeechRecognitionEvent` `emma` and `interpretation` properties, as they have not been supported for about 5 years.
- Documents the `SpeechRecognitionEvent` and `SpeechRecognitionErrorEvent` constructors.

See mdn/dom-examples#332 for a related demo addition.