Full metadata record
DC field | Value | Language |
---|---|---|
dc.contributor.author | Psutka, Josef | |
dc.contributor.author | Vaněk, Jan | |
dc.contributor.author | Pražák, Aleš | |
dc.date.accessioned | 2021-02-22T11:00:20Z | - |
dc.date.available | 2021-02-22T11:00:20Z | - |
dc.date.issued | 2020 | |
dc.identifier.citation | PSUTKA, J., VANĚK, J., PRAŽÁK, A. Complexity of the TDNN Acoustic Model with Respect to the HMM Topology. In: Text, Speech, and Dialogue 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings. Cham: Springer, 2020. pp. 465-473. ISBN 978-3-030-58322-4, ISSN 0302-9743. | cs |
dc.identifier.isbn | 978-3-030-58322-4 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | 2-s2.0-85091157003 | |
dc.identifier.uri | http://hdl.handle.net/11025/42718 | |
dc.format | 9 pp. | cs |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | en |
dc.publisher | Springer | en |
dc.relation.ispartofseries | Text, Speech, and Dialogue 23rd International Conference, TSD 2020, Brno, Czech Republic, September 8-11, 2020, Proceedings | en |
dc.rights | Full text is not accessible. | cs |
dc.rights | © Springer | en |
dc.title | Complexity of the TDNN Acoustic Model with Respect to the HMM Topology | en |
dc.type | conference paper | cs |
dc.type | conferenceObject | en |
dc.rights.access | closedAccess | en |
dc.type.version | publishedVersion | en |
dc.description.abstract-translated | In this paper, we discuss some of the properties of training acoustic models using a lattice-free version of the maximum mutual information criterion (LF-MMI). Currently, the LF-MMI method achieves state-of-the-art results on many speech recognition tasks. Some of the key features of the LF-MMI approach are: training DNN without initialization from a cross-entropy system, the use of a 3-fold reduced frame rate and the use of a simpler HMM topology. The conventional 3-state HMM topology was replaced in a typical LF-MMI training procedure with a special 1-stage HMM topology, that has different pdfs on the self-loop and forward transitions. In this paper, we would like to discuss both the different types of HMM topologies (conventional 1-, 2- and 3-state HMM topology) and the advantages of using biphone context modeling over using the original triphone or a simpler monophone context. We would also like to mention the impact of the subsampling factor to WER. | en |
dc.subject.translated | Speech recognition, Acoustic modeling, HMM topology, Lattice-free MMI | en |
dc.identifier.doi | 10.1007/978-3-030-58323-1_50 | |
dc.type.status | Peer-reviewed | en |
dc.identifier.obd | 43930364 | |
dc.project.ID | LO1506/PUNTIS - Sustainability support of the centre NTIS - New Technologies for the Information Society | cs |
Appears in collections: | Konferenční příspěvky / Conference papers (NTIS) Konferenční příspěvky / Conference Papers (KKY) OBD |
Files in this item:
File | Size | Format | |
---|---|---|---|
Psutka2020_Chapter_ComplexityOfTheTDNNAcousticMod.pdf | 260.45 kB | Adobe PDF | |
Please use this identifier to cite or link to this item:
http://hdl.handle.net/11025/42718
All items in DSpace are protected by copyright, with all rights reserved.