Full metadata record
| DC field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Kolář, Jáchym | |
| dc.contributor.author | Liu, Yang | |
| dc.date.accessioned | 2016-01-08T06:54:22Z | |
| dc.date.available | 2016-01-08T06:54:22Z | |
| dc.date.issued | 2010 | |
| dc.identifier.citation | KOLÁŘ, Jáchym; LIU, Yang. Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech. In: Acoustics, Speech and Signal Processing, 2010. ICASSP '10, 14-19 March 2010 Dallas, Texas, USA. Beijing: IEEE Press, 2010, p. 5258-5261. ISBN 978-1-4244-4296-6. | en |
| dc.identifier.isbn | 978-1-4244-4296-6 | |
| dc.identifier.uri | http://www.kky.zcu.cz/cs/publications/JachymKolar_2010_AutomaticSentence | |
| dc.identifier.uri | http://hdl.handle.net/11025/17174 | |
| dc.format | 4 s. | cs |
| dc.format.mimetype | application/pdf | |
| dc.language.iso | en | en |
| dc.publisher | IEEE Press | en |
| dc.rights | © Jáchym Kolář - Yang Liu | cs |
| dc.subject | porozumění mluvené řeči | cs |
| dc.subject | detekce hranice věty | cs |
| dc.subject | prozodie | cs |
| dc.subject | mechanické učení | cs |
| dc.title | Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech | en |
| dc.type | článek | cs |
| dc.type | article | en |
| dc.rights.access | openAccess | en |
| dc.type.version | publishedVersion | en |
| dc.description.abstract-translated | Automatic sentence segmentation of speech is important for enriching speech recognition output and aiding downstream language processing. This paper focuses on automatic sentence segmentation of speech in two different languages, English and Czech. For this task, we compare and combine three statistical models: HMM, maximum entropy, and the boosting-based model BoosTexter. All of these approaches rely on both textual and prosodic information. We evaluate the methods on a corpus of multiparty meetings in English and on a corpus of broadcast conversations in Czech, using both manual and speech recognition transcripts. The experiments show that the best results are achieved when all three models are combined via posterior probability interpolation. We observe differences in model performance between English and Czech, as well as differences in feature usage in the prosodic models between the two languages. Overall, the analysis is important for porting sentence segmentation approaches from one language to another. | en |
| dc.subject.translated | spoken language understanding | en |
| dc.subject.translated | sentence boundary detection | en |
| dc.subject.translated | prosody | en |
| dc.subject.translated | machine learning | en |
| dc.type.status | Peer-reviewed | en |
Appears in collections: Články / Articles (KKY)

Files in this item:

| File | Description | Size | Format |
| --- | --- | --- | --- |
| JachymKolar_2010_AutomaticSentence.pdf | Full text | 59.03 kB | Adobe PDF |


Use this identifier to cite or link to this item: http://hdl.handle.net/11025/17174

All items in DSpace are protected by copyright, with all rights reserved.