docs/speech-to-text/features/audio-filtering.mdx
Lines changed: 12 additions & 10 deletions
@@ -1,5 +1,5 @@
 ---
-description: "Learn how to utilize Audio Filtering to remove background speech"
+description: "Learn how to utilize audio filtering to remove background speech"
 keywords:
 [
 speechmatics,
@@ -15,19 +15,19 @@ keywords:
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
-# Audio Filtering
+# Audio filtering
 
-Audio Filtering pre-processes input audio to remove low-volume background speech which might otherwise be detected and transcribed.
+Audio filtering pre-processes input audio to remove low-volume background speech which might otherwise be detected and transcribed.
 
 :::info
-This can be useful, for example, in a call center to avoid transcribing other agents' speech from the background.
+This can be useful, for example, in a call center to avoid transcribing other agents' background speech.
 :::
 
-If you're new to Speechmatics, start by exploring our guides on [Transcribing a File](/speech-to-text/batch/quickstart) or [Transcribing in Real-Time](/speech-to-text/realtime/quickstart).
+If you're new to Speechmatics, start by exploring our guides on [transcribing a file](/speech-to-text/batch/quickstart) or [transcribing in real-time](/speech-to-text/realtime/quickstart).
 
 ## Example
 
-To activate Audio Filtering, include the following configuration:
+To activate audio filtering, include the following configuration:
 
 ```json
 {
@@ -41,13 +41,15 @@ To activate Audio Filtering, include the following configuration:
 }
 }
 ```
-This will avoid processing any audio which is below the `3.4` volume threshold. For technical details on how this threshold is used see [here](#technical-details)
+This will avoid processing any audio which is below the `3.4` volume threshold. For technical details on how this threshold is calculated and used, see [here](#technical-details)
 
 `volume_threshold` supports a range of `0 - 100` where `0` does not filter any audio and `100` removes all audio.
 
-## Volume Labelling
+In realtime mode, the threshold can be adjusted dynamically with the [SetRecognitionConfig](/api-ref/realtime-transcription-websocket#setrecognitionconfig) message.
 
-If Audio Filtering is configured, words will be labelled with their volume like this (range for `volume_threshold` is `0-100`):
+## Volume labelling
+
+If audio filtering is configured, words will be labelled with their volume like this (the range for `volume_threshold` is `0-100`):
 
 ```json
 {
@@ -69,7 +71,7 @@ These values can be used as a guide to setting the volume threshold, but we reco
 
 To obtain volume labelling without filtering any audio, supply an empty config object (`{}`) or set the `volume_threshold` to `0.0`.
 
-## Technical Details
+## Technical details
 
 Once the audio is in a raw format (16kHz 16bit mono), it is split into 0.01s chunks. For each chunk, the root mean square amplitude of the signal is calculated, and scaled to the range `0 - 100`. If the volume is less than the supplied cut-off, the chunk will be replaced with silence.
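
The chunking and thresholding described in the "Technical details" paragraph can be sketched roughly as follows. This is an illustration only, not the Speechmatics implementation: the docs do not say how the RMS value is scaled to `0 - 100`, so the sketch assumes a linear scale against the 16-bit full-scale amplitude, and the names `chunk_volume` and `filter_audio` are hypothetical.

```python
import math
from array import array

SAMPLE_RATE = 16_000                 # 16 kHz mono, as stated in the docs
CHUNK_SAMPLES = SAMPLE_RATE // 100   # 0.01 s of audio -> 160 samples
FULL_SCALE = 32768.0                 # ASSUMPTION: linear scaling against int16 full scale


def chunk_volume(chunk):
    """RMS amplitude of one chunk, scaled to 0-100 (the scaling is an assumption)."""
    if not chunk:
        return 0.0
    rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
    return min(100.0, 100.0 * rms / FULL_SCALE)


def filter_audio(samples, volume_threshold):
    """Return a copy of `samples` with chunks below the threshold replaced by silence."""
    out = array("h", samples)        # signed 16-bit samples
    for start in range(0, len(out), CHUNK_SAMPLES):
        chunk = out[start:start + CHUNK_SAMPLES]
        if chunk_volume(chunk) < volume_threshold:
            # Replace the quiet chunk with zeros, keeping its length unchanged.
            out[start:start + len(chunk)] = array("h", [0] * len(chunk))
    return out
```

With a threshold of `0` the comparison is never true, so no chunk is silenced, which matches the documented behaviour of obtaining volume labels without filtering.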
spec/realtime.yaml
Lines changed: 2 additions & 0 deletions
@@ -1067,6 +1067,7 @@ components:
 | `data_error` | Unable to accept the data specified - usually because there is too much data being sent at once |
 | `buffer_error` | Unable to fit the data in a corresponding buffer. This can happen for clients sending the input data faster than real-time. |
 | `protocol_error` | Message received was syntactically correct, but could not be accepted due to protocol limitations. This is usually caused by messages sent in the wrong order. |
+| `start_recognition_timeout` | The timeout for sending StartRecognition has been exceeded (SaaS only) |
 | `quota_exceeded` | Maximum number of concurrent connections allowed for the contract has been reached |
 | `timelimit_exceeded` | Usage quota for the contract has been reached |
 | `idle_timeout` | Idle duration limit was reached (no audio data sent within the last hour), a closing handshake with code 1008 follows this in-band error. |
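
A client might surface these error types, including the new `start_recognition_timeout`, along the lines of the sketch below. The descriptions are shortened from the table rows in this hunk. The message envelope (a JSON object with `message`, `type` and `reason` fields) is an assumption not shown in this diff, and `describe_error` is a hypothetical helper name.

```python
import json

# Descriptions shortened from the error table rows visible in this hunk.
ERROR_TYPES = {
    "data_error": "Unable to accept the data specified",
    "buffer_error": "Unable to fit the data in a corresponding buffer",
    "protocol_error": "Message could not be accepted due to protocol limitations",
    "start_recognition_timeout": "Timeout for sending StartRecognition exceeded (SaaS only)",
    "quota_exceeded": "Maximum number of concurrent connections reached",
    "timelimit_exceeded": "Usage quota for the contract has been reached",
    "idle_timeout": "Idle duration limit was reached",
}


def describe_error(raw_message):
    """Return a readable summary of an in-band error, or None for other messages."""
    msg = json.loads(raw_message)
    # ASSUMPTION: errors arrive as {"message": "Error", "type": ..., "reason": ...}.
    if msg.get("message") != "Error":
        return None
    error_type = msg.get("type", "")
    description = ERROR_TYPES.get(error_type, "Unrecognised error type")
    summary = f"{error_type}: {description}"
    reason = msg.get("reason")
    return f"{summary} ({reason})" if reason else summary
```

Unknown types fall through to a generic description rather than raising, so a client built this way keeps working when the spec gains further error types.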