This is a simple bot that converts voice messages into text. I reviewed the publicly available speech recognition services; the results are in the table below.
- TypeScript
- Fastify
- Axios
- PostgreSQL
- Google Analytics
- Amplitude
flowchart BT
subgraph tg[Telegram]
voice[Voice message]
audio[Audio]
video[Video note]
text[Text message]
bot[AudioMessBot API]
end
subgraph cluster[Replicas]
r1{{Replica 1}}
ar{{Active replica}}
r2{{Replica N}}
end
voice-->bot
audio-->bot
video-->bot
bot-->text
bot---ar
ar---db[(PSQL\nDatabase)]
ar---cloud((Cloud API provider))
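
As a rough illustration of the first hop in the flowchart (voice, audio, or video note goes in, text comes out), the bot has to decide which attachment of an incoming message to transcribe. The sketch below assumes the `voice`, `audio`, and `video_note` fields of the Telegram Bot API `Message` object, each carrying a `file_id` and a `duration` in seconds. The `pickMedia` helper and the `Media` type are hypothetical names used only for this example, not the bot's actual code.

```typescript
// Minimal subset of the Telegram Bot API Message object (only the fields used here).
interface TgFile {
  file_id: string;
  duration: number; // seconds
}

interface TgMessage {
  voice?: TgFile;
  audio?: TgFile;
  video_note?: TgFile;
  text?: string;
}

type MediaKind = "voice" | "audio" | "video_note";

interface Media {
  kind: MediaKind;
  fileId: string;
  durationSec: number;
}

// Hypothetical helper: returns the attachment to transcribe,
// or null when the message carries no supported media (e.g. plain text).
function pickMedia(message: TgMessage): Media | null {
  const kinds: MediaKind[] = ["voice", "audio", "video_note"];
  for (const kind of kinds) {
    const file = message[kind];
    if (file) {
      return { kind, fileId: file.file_id, durationSec: file.duration };
    }
  }
  return null;
}
```

A replica would then download the file by `fileId`, send it to the cloud provider, and reply with the recognized text.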
| Service provider | Russian language | Synchronous API | Duration limit | File upload | Speed |
|---|---|---|---|---|---|
| IBM Watson | no | no | N/A | Unknown | Unknown |
| Microsoft Azure | no | no | N/A | Unknown | Unknown |
| Amazon AWS | yes | no | Unlimited | S3 | Minutes |
| Google Cloud | yes | yes | 1 minute*[1] | Direct / Google Drive | Instant*[2] |
| Wit.ai | yes | yes | 5 minutes | Direct | Instant |
\* For direct upload

[1] Unlimited for asynchronous upload via Google Drive

[2] Takes a while for asynchronous upload via Google Drive
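
To show how the table could drive a decision, here is a minimal sketch that picks a provider from the duration of the recording. The limits are copied from the table (direct upload only). IBM Watson and Microsoft Azure are left out because they are listed without Russian support. The `Provider` type, the `pickProvider` function, and the rule "prefer a synchronous provider, otherwise fall back to an asynchronous one" are assumptions made for this example, not a description of how the bot actually chooses.

```typescript
interface Provider {
  name: string;
  synchronous: boolean;
  maxDurationSec: number; // Infinity means no limit
}

// Values taken from the comparison table (direct upload limits).
const providers: Provider[] = [
  { name: "Google Cloud", synchronous: true, maxDurationSec: 60 },
  { name: "Wit.ai", synchronous: true, maxDurationSec: 300 },
  { name: "Amazon AWS", synchronous: false, maxDurationSec: Infinity },
];

// Assumed policy: return the first synchronous provider whose limit fits,
// otherwise the first asynchronous provider that fits, otherwise null.
function pickProvider(durationSec: number): Provider | null {
  let fallback: Provider | null = null;
  for (const p of providers) {
    if (durationSec > p.maxDurationSec) continue;
    if (p.synchronous) return p;
    if (fallback === null) fallback = p;
  }
  return fallback;
}
```

Under this policy a 30-second voice message goes to Google Cloud, a 2-minute one to Wit.ai, and anything longer than 5 minutes to Amazon AWS, where the result takes minutes rather than arriving instantly.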