WavTTS Model For High-Quality Zero-Shot TTS
A new paper describes WavTTS, a zero-shot text-to-speech (TTS) model that aims for high-quality synthesis by modeling the raw waveform directly. Its abstract opens by referring to recent diffusion models that operate on variational autoencoders (VAEs).
Sources · 7 independent
Modernity/arxiv
“WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling. Authors: Wenxi Chen, Dongya Jia, Yushen Chen, Zhikang Niu, Yuzhe Liang, Xiquan Li, Ruiqi Yan, Ziyang Ma, Guanrou Yang, Sanyuan Chen and 4 others Abstract: Recently, diffusion models operating on VAE”