-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathbacklog.taskpaper
53 lines (48 loc) · 2.83 KB
/
backlog.taskpaper
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
Add import cmd that allows binding chapters not produced by lego:
- cmd support folder parameter
- does everything except for synthesis
- rename chapters to chapter-paragraph scheme
- generate chapters list
- (everything else should just flow from that)
Archive:
Text processor (divido): @done(2023-02-18)
- Only support .txt files @done(2023-02-18)
- Add documentation to support converting other formats to .txt (https://textract.readthedocs.io/en/stable/) @done(2023-02-18)
- Calibre @done(2023-02-18)
- Break large text files into >5000 characters segments: @done(2023-02-18)
- Break text into "chapters" @done(2023-02-18)
- Break large chunk into "paragraphs" @done(2023-02-18)
- If needed, break "paragraphs" into "sentences" @done(2023-02-18)
- Validate that anticipated payload would not have "sentences" more than 5000 characters (or whatever is the limit with other data) @done(2023-02-18)
Improve results with SSML: @done(2023-02-18)
- https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup-structure @done(2023-02-18)
- Add paragraph, sentence markers @done(2023-02-18)
- Replace Significant breaks with pauses @done(2023-02-18)
- etc @done(2023-02-18)
Azure Cognitive Services text-to-speech API Client: @done(2023-02-18)
- https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/index-text-to-speech @done(2023-02-18)
- Batch Synthesis client: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-synthesis @done(2023-02-18)
ffmpeg integration: @done(2023-02-18)
- generate ffmpeg command to concatenate all chunks into a single m4b audiobook @done(2023-02-18)
- figure out metadata @done(2023-02-18)
- should work with LitRes mp3 to concatenate them @done(2023-02-18)
lego app: @done(2023-02-18)
- add clo cmds and handlers: @done(2023-02-18)
- list-voices @done(2023-02-18)
- synthesize @done(2023-02-18)
- split .txt file into chapters - paragraphs - sentences @done(2023-02-18)
- use GCP TTS to convert to OGG @done(2023-02-18)
- concatenate with ffmpeg @done(2023-02-18)
- add metadata @done(2023-02-18)
- bind @done(2023-02-18)
- concatenate mp3s to m4b @done(2023-02-18)
- (replace AudioBookBinder in my own setup) @done(2023-02-18)
GCP Text-to-Speech API Client (google-tts-integration): @done(2023-01-03)
- Wrap APIs: @done(2023-01-03)
- https://cloud.google.com/text-to-speech @done(2023-01-03)
- https://cloud.google.com/text-to-speech/docs/reference/rest/v1/voices/list @done(2023-01-03)
- https://cloud.google.com/text-to-speech/docs/reference/rest/v1/text/synthesize @done(2023-01-03)
- Provide sane defaults @done(2023-01-03)
- Add API key hook @done(2023-01-03)
- Validate 5000 characters limit @done(2023-01-03)
- Get/write out OGG media @done(2023-01-03)