Take a look at the log below. The audio separation step ran on CUDA and finished very quickly, but once transcription started, it was fast at first and then got slower and slower.
MDX Kim_Vocal_2 model probably is running on CUDA: 99% | 276/280 | 01:05<<00:00
MDX Kim_Vocal_2 model probably is running on CUDA: 99% | 277/280 | 01:05<<00:00
MDX Kim_Vocal_2 model probably is running on CUDA: 99% | 278/280 | 01:06<<00:00
MDX Kim_Vocal_2 model probably is running on CUDA: 100% | 279/280 | 01:06<<00:00
MDX Kim_Vocal_2 model probably is running on CUDA: 100% | 280/280 | 01:06<<00:00
MDX Kim_Vocal_2 model probably is running on CUDA: 100% | 280/280 | 01:07<<00:00
Starting transcription on: C:\Users\leo\AppData\Local\Temp\bk_asr\tmppif13f8k\audio.wav
1% | 29/4200 | 00:01<<03:27 | 20.08 audio seconds/s
2% | 69/4200 | 00:02<<02:30 | 27.51 audio seconds/s
2% | 99/4200 | 00:04<<02:46 | 24.57 audio seconds/s
3% | 129/4200 | 00:05<<02:52 | 23.64 audio seconds/s
4% | 162/4200 | 00:06<<02:47 | 24.16 audio seconds/s
5% | 200/4200 | 00:08<<02:41 | 24.79 audio seconds/s
6% | 233/4200 | 00:10<<02:55 | 22.54 audio seconds/s
6% | 265/4200 | 00:11<<02:54 | 22.49 audio seconds/s
7% | 295/4200 | 00:13<<03:02 | 21.40 audio seconds/s
8% | 321/4200 | 00:15<<03:07 | 20.66 audio seconds/s
8% | 353/4200 | 00:24<<04:24 | 14.52 audio seconds/s
9% | 375/4200 | 00:26<<04:25 | 14.39 audio seconds/s
10% | 403/4200 | 00:27<<04:23 | 14.40 audio seconds/s
10% | 433/4200 | 00:29<<04:18 | 14.55 audio seconds/s
11% | 462/4200 | 00:45<<06:08 | 10.15 audio seconds/s
12% | 489/4200 | 00:47<<05:58 | 10.36 audio seconds/s
12% | 521/4200 | 00:58<<06:55 | 8.85 audio seconds/s
13% | 546/4200 | 00:59<<06:40 | 9.13 audio seconds/s
14% | 574/4200 | 01:09<<07:18 | 8.27 audio seconds/s
14% | 605/4200 | 01:18<<07:46 | 7.71 audio seconds/s
15% | 632/4200 | 01:20<<07:32 | 7.89 audio seconds/s
16% | 660/4200 | 01:26<<07:42 | 7.65 audio seconds/s
16% | 692/4200 | 01:27<<07:25 | 7.87 audio seconds/s
17% | 720/4200 | 01:35<<07:40 | 7.56 audio seconds/s
18% | 748/4200 | 01:44<<08:02 | 7.15 audio seconds/s
19% | 778/4200 | 01:54<<08:22 | 6.81 audio seconds/s
19% | 803/4200 | 02:05<<08:50 | 6.40 audio seconds/s
20% | 825/4200 | 02:23<<09:47 | 5.74 audio seconds/s
20% | 855/4200 | 02:26<<09:31 | 5.85 audio seconds/s
21% | 884/4200 | 02:32<<09:33 | 5.79 audio seconds/s
Could it be that, starting from around the 24-second mark of the video, there is a lot of background music or noisy voices?
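One way to check where the slowdown actually happens is to compute the speed of each interval between progress lines. The rate printed in the log appears to be a running average since the start (for example, 884 audio seconds over 02:32 is about 5.8 audio seconds/s), so it hides how slow individual stretches are. The sketch below is only an illustration: the regular expression and the function names `parse_progress` and `interval_rates` are my own assumptions based on the log format shown in this issue, not part of the tool.

```python
import re

# Assumed line format, taken from the log in this issue, e.g.
# "8% | 353/4200 | 00:24<<04:24 | 14.52 audio seconds/s".
# Requiring the "audio seconds/s" suffix skips the MDX separation lines.
PROGRESS = re.compile(
    r"(\d+)/(\d+) \| (\d+):(\d+)<<\d+:\d+ \| [\d.]+ audio seconds/s"
)

def parse_progress(log_text):
    """Return (audio_seconds_done, elapsed_wall_seconds) pairs found in log_text."""
    points = []
    for match in PROGRESS.finditer(log_text):
        done, _total, minutes, seconds = map(int, match.groups())
        points.append((done, minutes * 60 + seconds))
    return points

def interval_rates(points):
    """Audio seconds processed per wall-clock second between consecutive samples."""
    rates = []
    for (done_a, t_a), (done_b, t_b) in zip(points, points[1:]):
        dt = t_b - t_a
        if dt > 0:  # elapsed time has 1 s resolution, so skip zero-length intervals
            rates.append((done_a, done_b, (done_b - done_a) / dt))
    return rates

# A few lines copied from the log above.
log = """
7% | 295/4200 | 00:13<<03:02 | 21.40 audio seconds/s
8% | 321/4200 | 00:15<<03:07 | 20.66 audio seconds/s
8% | 353/4200 | 00:24<<04:24 | 14.52 audio seconds/s
9% | 375/4200 | 00:26<<04:25 | 14.39 audio seconds/s
"""

for start, end, rate in interval_rates(parse_progress(log)):
    print(f"audio {start}-{end} s: {rate:.1f} audio seconds/s")
```

On these sample lines, the stretch from 321 to 353 audio seconds took 9 wall-clock seconds, roughly 3.6 audio seconds/s, while the log still displays 14.52. Because elapsed time is only printed to the second, the per-interval numbers are rough, but they do show which part of the audio the stalls fall in. Note that the stall begins at about 321 audio seconds into the file; `00:24` is the elapsed processing time, not a position in the audio.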