Title: Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech
Authors: Kolář, Jáchym
Liu, Yang
Citation: KOLÁŘ, Jáchym; LIU, Yang. Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech. In: Acoustics, Speech and Signal Processing, 2010. ICASSP '10, 14-19 March 2010, Dallas, Texas, USA. Beijing: IEEE Press, 2010, p. 5258-5261. ISBN 978-1-4244-4296-6.
Issue Date: 2010
Publisher: IEEE Press
Document type: article
URI: http://www.kky.zcu.cz/cs/publications/JachymKolar_2010_AutomaticSentence
http://hdl.handle.net/11025/17174
ISBN: 978-1-4244-4296-6
Keywords: spoken language understanding; sentence boundary detection; prosody; machine learning
Abstract: Automatic sentence segmentation of speech is important for enriching speech recognition output and aiding downstream language processing. This paper focuses on automatic sentence segmentation of speech in two languages, English and Czech. For this task, we compare and combine three statistical models: an HMM, a maximum entropy model, and the boosting-based model BoosTexter. All of these approaches rely on both textual and prosodic information. We evaluate the methods on a corpus of multiparty meetings in English and on a corpus of broadcast conversations in Czech, using both manual and speech recognition transcripts. The experiments show that superior results are achieved when all three models are combined via posterior probability interpolation. We observe differences in model performance between English and Czech, as well as differences in feature usage in the prosodic models for the two languages. Overall, the analysis is important for porting sentence segmentation approaches from one language to another.
Rights: © Jáchym Kolář - Yang Liu
Appears in Collections: Články / Articles (KKY)

Files in This Item:
File: JachymKolar_2010_AutomaticSentence.pdf
Description: Full text
Size: 59.03 kB
Format: Adobe PDF


