audio duration changed after analysis-synthesis #5

wincing2 · 2024-10-16T07:02:23Z

Analysis-synthesis produces a longer speech audio file than the source. Is there something wrong here? The code prepends 0.5 s of silence before the analysis, but the resulting audio is NOT 0.5 s longer than the source audio. For example, this file is 7.34 s long, but the synthesized one is 8.04 s, a difference of 0.70 s.
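To make the mismatch concrete, here is a minimal sketch of the check, using only the Python standard library. The function names (`wav_duration_seconds`, `unexplained_extra_seconds`) are my own and hypothetical, not part of this repository, and the sketch assumes the files are plain PCM WAV.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def unexplained_extra_seconds(source_seconds, output_seconds, padding_seconds=0.5):
    """Return the extra output duration not accounted for by the prepended silence."""
    return output_seconds - source_seconds - padding_seconds


# Durations reported in this issue; 0.5 s is the silence the code prepends.
extra = unexplained_extra_seconds(7.34, 8.04)
print(f"unexplained extra duration: {extra:.2f} s")
```

With the numbers above, about 0.20 s of the added length is not explained by the prepended silence. For real files, `wav_duration_seconds` can be called on the source and the synthesized output to get the two durations.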

There is also an error message; I'm not sure whether it has anything to do with the change in duration:

Error(s) in loading state_dict for StagedVQVAE:
Unexpected key(s) in state_dict: "mel_spectrogram.mel_stft.mel_scale.fb", "mel_spectrogram.mel_stft.spectrogram.window"
