New model training #6

lucasjinreal · 2025-01-15T05:07:24Z

Kokoro hasn't open-sourced its training code yet.
However, I am currently working on reproducing its training procedure. I will also implement multi-speaker TTS with style control, and I'd like to support voice cloning as well.

If you are also interested in uncovering how Kokoro is trained, please leave a comment below. Let's get in touch and see how far we can push it.

JMLLR1 · 2025-01-15T19:50:26Z

I am interested in creating a training script as well. Feel free to DM me.

lucasjinreal · 2025-01-16T01:23:23Z

@JMLLR1 Please come aboard: https://discord.gg/6Cryfdvb. We are discussing how to start the same way Raven (the Kokoro author) does, training with fixed styles. Ping me there (airflow).

lucasjinreal · 2025-01-17T04:18:06Z

Guys, training on Expresso is ongoing now:

Epoch [7/50], Step [10/724], Mel Loss: 0.72029, Gen Loss: 28.53927, Disc Loss: 1.39727, Mono Loss: 0.04416, S2S Loss: 1.00780, SLM Loss: 2.52246
Time elasped: 18.641412496566772
Epoch [7/50], Step [20/724], Mel Loss: 0.72382, Gen Loss: 28.51638, Disc Loss: 1.39997, Mono Loss: 0.04016, S2S Loss: 1.05560, SLM Loss: 2.53120
Time elasped: 36.09541082382202
Epoch [7/50], Step [30/724], Mel Loss: 0.70927, Gen Loss: 28.73309, Disc Loss: 1.38668, Mono Loss: 0.07664, S2S Loss: 1.07032, SLM Loss: 2.59169
Time elasped: 53.391515016555786
Epoch [7/50], Step [40/724], Mel Loss: 0.69908, Gen Loss: 28.99346, Disc Loss: 1.37871, Mono Loss: 0.06705, S2S Loss: 1.01630, SLM Loss: 2.52710
Time elasped: 71.93912291526794
Epoch [7/50], Step [50/724], Mel Loss: 0.71041, Gen Loss: 28.97557, Disc Loss: 1.38679, Mono Loss: 0.05649, S2S Loss: 0.69880, SLM Loss: 2.42558
Time elasped: 89.9345874786377
Epoch [7/50], Step [60/724], Mel Loss: 0.70334, Gen Loss: 29.16667, Disc Loss: 1.38239, Mono Loss: 0.02717, S2S Loss: 0.80091, SLM Loss: 2.44010
Time elasped: 108.03602576255798
Epoch [7/50], Step [70/724], Mel Loss: 0.70238, Gen Loss: 29.45588, Disc Loss: 1.37100, Mono Loss: 0.07006, S2S Loss: 0.91138, SLM Loss: 2.45986
Time elasped: 125.50399613380432
Epoch [7/50], Step [80/724], Mel Loss: 0.69871, Gen Loss: 29.58579, Disc Loss: 1.36915, Mono Loss: 0.06034, S2S Loss: 0.95874, SLM Loss: 2.42004
Time elasped: 144.0021252632141
ɪt sˈiːmd ɐn ˈeɪdʒ, ænd ðə deɪnˈuːmɔ̃ jˈɛt ɐnˈʌðɚɹ ˈeɪdʒ dᵻfˈɜːd.
Epoch [7/50], Step [90/724], Mel Loss: 0.69936, Gen Loss: 29.68557, Disc Loss: 1.36117, Mono Loss: 0.08106, S2S Loss: 0.87225, SLM Loss: 2.47548
Time elasped: 162.62565422058105

Stay tuned for my updates.
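For anyone who wants to track runs like the one above, here is a minimal sketch of a parser for this log format. It is not part of any Kokoro or StyleTTS2 code; the names `parse_log` and `seconds_per_step` are hypothetical, and it assumes that the `Time elasped` value is cumulative within the epoch (the values above grow monotonically, which suggests this, but it is not confirmed). Under that assumption, the lines above work out to roughly 1.8 seconds per step.

```python
import re

# Patterns follow the log lines above. "elasped" is spelled as in the log.
STEP_RE = re.compile(r"Epoch \[(\d+)/(\d+)\], Step \[(\d+)/(\d+)\], (.*)")
LOSS_RE = re.compile(r"(\w+) Loss: ([0-9.]+)")
TIME_RE = re.compile(r"Time elasped: ([0-9.]+)")

# A few lines copied verbatim from the log above.
SAMPLE_LOG = """\
Epoch [7/50], Step [10/724], Mel Loss: 0.72029, Gen Loss: 28.53927, Disc Loss: 1.39727, Mono Loss: 0.04416, S2S Loss: 1.00780, SLM Loss: 2.52246
Time elasped: 18.641412496566772
Epoch [7/50], Step [80/724], Mel Loss: 0.69871, Gen Loss: 29.58579, Disc Loss: 1.36915, Mono Loss: 0.06034, S2S Loss: 0.95874, SLM Loss: 2.42004
Time elasped: 144.0021252632141
ɪt sˈiːmd ɐn ˈeɪdʒ, ænd ðə deɪnˈuːmɔ̃ jˈɛt ɐnˈʌðɚɹ ˈeɪdʒ dᵻfˈɜːd.
Epoch [7/50], Step [90/724], Mel Loss: 0.69936, Gen Loss: 29.68557, Disc Loss: 1.36117, Mono Loss: 0.08106, S2S Loss: 0.87225, SLM Loss: 2.47548
Time elasped: 162.62565422058105
"""


def parse_log(text):
    """Return one dict per step line, with losses and elapsed time attached."""
    records = []
    for line in text.splitlines():
        step_match = STEP_RE.match(line)
        if step_match:
            epoch, _, step, steps_per_epoch, rest = step_match.groups()
            record = {
                "epoch": int(epoch),
                "step": int(step),
                "steps_per_epoch": int(steps_per_epoch),
            }
            # "Mel Loss: 0.72029" becomes record["mel_loss"] = 0.72029
            for name, value in LOSS_RE.findall(rest):
                record[name.lower() + "_loss"] = float(value)
            records.append(record)
            continue
        time_match = TIME_RE.match(line)
        if time_match and records:
            # The time line follows the step line it belongs to.
            records[-1]["elapsed"] = float(time_match.group(1))
        # Any other line (e.g. a printed phoneme sample) is ignored.
    return records


def seconds_per_step(records):
    """Average seconds per step between the first and last timed records.

    Assumes "elapsed" is cumulative within one epoch (an assumption).
    """
    timed = [r for r in records if "elapsed" in r]
    first, last = timed[0], timed[-1]
    return (last["elapsed"] - first["elapsed"]) / (last["step"] - first["step"])


if __name__ == "__main__":
    recs = parse_log(SAMPLE_LOG)
    rate = seconds_per_step(recs)
    print(f"{rate:.2f} s/step")
    print(f"{rate * recs[-1]['steps_per_epoch'] / 60:.1f} min/epoch (estimate)")
```

The per-epoch figure is only an extrapolation from the steps shown, so treat it as a rough estimate.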

JMLLR1 · 2025-01-17T19:53:05Z

Very nice, it seems I have been too slow. Unfortunately I am unable to join the Discord until Sunday, but I will join. What dataset are you training on right now?

lucasjinreal · 2025-01-18T09:22:49Z

Currently I'm doing test training with Expresso. I have finished 24 / 30 in the second training run, and it works normally.

Due to the limited scale of the Expresso dataset, the model only learns how to speak. The next step is to enlarge the dataset and train with multiple languages.

If anyone is interested, please contribute datasets or thoughts on improving the model's ability.

One advantage of StyleTTS2 compared with AR-based TTS models is its relatively small size and its effectiveness.

I think we can push StyleTTS2 further if multilingual and large-scale training are supported.

Marootc · 2025-01-23T11:41:51Z

Hello, friend, how are the results of the trained model? How does it do when trained on a single speaker? Can the speaker's voice stay consistent? Kokoro's whisper is great. I want to know how it was made.

lucasjinreal pinned this issue Jan 15, 2025
