Automatic Correction of i/y Spelling in Czech ASR Output

Švec, Jan; Lehečka, Jan; Šmídl, Luboš; Ircing, Pavel

Full metadata record

DC field	Value	Language
dc.contributor.author	Švec, Jan
dc.contributor.author	Lehečka, Jan
dc.contributor.author	Šmídl, Luboš
dc.contributor.author	Ircing, Pavel
dc.date.accessioned	2021-03-29T10:00:17Z	-
dc.date.available	2021-03-29T10:00:17Z	-
dc.date.issued	2020
dc.identifier.citation	ŠVEC, J. LEHEČKA, J. ŠMÍDL, L. IRCING, P. Automatic Correction of i/y Spelling in Czech ASR Output. In: Text, Speech, and Dialogue 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings. Cham: Springer, 2020. pp. 321-330. ISBN 978-3-030-58322-4, ISSN 0302-9743.	cs
dc.identifier.isbn	978-3-030-58322-4
dc.identifier.issn	0302-9743
dc.identifier.uri	2-s2.0-85091182120
dc.identifier.uri	http://hdl.handle.net/11025/43118
dc.format	10 pp.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Springer	en
dc.relation.ispartofseries	Text, Speech, and Dialogue 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings	en
dc.rights	Full text is not accessible.	cs
dc.rights	© Springer	en
dc.title	Automatic Correction of i/y Spelling in Czech ASR Output	en
dc.type	conference paper	cs
dc.type	conferenceObject	en
dc.rights.access	closedAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	This paper concentrates on the design and evaluation of the method that would be able to automatically correct the spelling of i/y in the Czech words at the output of the ASR decoder. After analysis of both the Czech grammar rules and the data, we have decided to deal only with the endings consisting of consonants b/f/l/m/p/s/v/z followed by i/y in both short and long forms. The correction is framed as the classification task where the word could belong to the “i” class, the “y” class or the “empty” class. Using the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) architecture, we were able to substantially improve the correctness of the i/y spelling both on the simulated and the real ASR output. Since the misspelling of i/y in the Czech texts is seen by the majority of native Czech speakers as a blatant error, the corrected output greatly improves the perceived quality of the ASR system.	en
dc.subject.translated	Grammatical error correction, ASR, BERT	en
dc.identifier.doi	10.1007/978-3-030-58323-1_35
dc.type.status	Peer-reviewed	en
dc.identifier.obd	43930359
dc.project.ID	TN01000024/Národní centrum kompetence - Kybernetika a umělá inteligence	cs
dc.project.ID	LM2018140/E-infrastruktura CZ	cs
Appears in collections:	Konferenční příspěvky / Conference papers (NTIS) Konferenční příspěvky / Conference Papers (KKY) OBD
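
The abstract frames the correction as a three-way classification ("i", "y", or "empty") over words ending in b/f/l/m/p/s/v/z followed by i/y in short or long form. A minimal sketch of that labeling scheme might look like the following. The function names, the assumption that an ending is exactly the last two characters of a lowercase word, and the rule that a correction keeps the vowel length are all illustrative assumptions, not details taken from the paper. The BERT classifier itself is not shown.

```python
import re

# Assumption: ASR output is lowercase, and a relevant "ending" is the final
# consonant (b/f/l/m/p/s/v/z) plus a short or long i/y vowel.
ENDING = re.compile(r"[bflmpsvz][iyíý]$")

# Hypothetical translation tables that swap i/y while keeping vowel length.
TO_I = str.maketrans("yý", "ií")
TO_Y = str.maketrans("ií", "yý")


def label(word: str) -> str:
    """Return the class of a word: "i", "y", or "empty" (no relevant ending)."""
    if not ENDING.search(word):
        return "empty"
    return "i" if word[-1] in "ií" else "y"


def apply_label(word: str, predicted: str) -> str:
    """Rewrite the final vowel of a word according to a predicted class."""
    if predicted == "empty" or not ENDING.search(word):
        return word  # nothing to correct
    table = TO_I if predicted == "i" else TO_Y
    return word[:-1] + word[-1].translate(table)


if __name__ == "__main__":
    for w in ["byli", "byly", "milý", "ženy"]:
        print(w, label(w))
```

In such a setup, `label` would produce training targets from correctly spelled text, and `apply_label` would write a classifier's prediction back into the ASR transcript.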

Files attached to this record:

File	Size	Format
Švec2020_Chapter_AutomaticCorrectionOfIYSpellin.pdf	251.46 kB	Adobe PDF


Use this identifier to cite or link to this record: http://hdl.handle.net/11025/43118
