Czech Speech Synthesis with Generative Neural Vocoder

Vít, Jakub; Hanzlíček, Zdeněk; Matoušek, Jindřich

Full metadata record

DC field	Value	Language
dc.contributor.author	Vít, Jakub
dc.contributor.author	Hanzlíček, Zdeněk
dc.contributor.author	Matoušek, Jindřich
dc.date.accessioned	2020-03-23T11:00:23Z	-
dc.date.available	2020-03-23T11:00:23Z	-
dc.date.issued	2019
dc.identifier.citation	VÍT, J., HANZLÍČEK, Z., MATOUŠEK, J. Czech Speech Synthesis with Generative Neural Vocoder. In: Text, Speech, and Dialogue 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11-13, 2019, Proceedings. Cham: Springer, 2019. pp. 307-315. ISBN 978-3-030-27946-2, ISSN 0302-9743.	en
dc.identifier.isbn	978-3-030-27946-2
dc.identifier.issn	0302-9743
dc.identifier.uri	2-s2.0-85072849542
dc.identifier.uri	http://hdl.handle.net/11025/36715
dc.description.abstract	In recent years, new neural architectures for generating high-quality synthetic speech on a per-sample basis were introduced. We describe our application of statistical parametric speech synthesis based on LSTM neural networks combined with a generative neural vocoder for the Czech language. We used a traditional LSTM architecture for generating vocoder parametrization from linguistic features. We replaced a standard vocoder with a WaveRNN neural network. We conducted a MUSHRA listening test to compare the proposed approach with the unit selection and LSTM-based parametric speech synthesis utilizing a standard vocoder. In contrast with our previous work, we managed to outperform a well-tuned unit selection TTS system by a great margin on both professional and amateur voices.	en
dc.format	9 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Springer	en
dc.relation.ispartofseries	Text, Speech, and Dialogue 22nd International Conference, TSD 2019, Ljubljana, Slovenia, September 11-13, 2019, Proceedings	en
dc.rights	Full text is not accessible.	cs
dc.rights	© Springer	en
dc.title	Czech Speech Synthesis with Generative Neural Vocoder	en
dc.type	conference paper	cs
dc.type	conferenceObject	en
dc.rights.access	closedAccess	en
dc.type.version	publishedVersion	en
dc.subject.translated	Speech synthesis, LSTM-based speech synthesis, WaveRNN, Neural vocoder, Unit selection	en
dc.identifier.doi	10.1007/978-3-030-27947-9_26
dc.type.status	Peer-reviewed	en
dc.identifier.obd	43926904
dc.project.ID	SGS-2019-027/Inteligentní metody strojového vnímání a porozumění 4	cs
dc.project.ID	GA19-19324S/Plně trénovatelná syntéza české řeči z textu s využitím hlubokých neuronových sítí	cs
Appears in collections:	Conference papers (NTIS); Conference Papers (KKY); OBD

Files attached to this record:

File	Size	Format
Vit2019_Chapter_CzechSpeechSynthesisWithGenera.pdf	401.08 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/36715

All records in DSpace are protected by copyright, all rights reserved.
