Online speaker adaptation of an acoustic model using face recognition

Campr, Pavel; Pražák, Aleš; Psutka, Josef V.; Psutka, Josef

Full metadata record

DC field	Value	Language
dc.contributor.author	Campr, Pavel
dc.contributor.author	Pražák, Aleš
dc.contributor.author	Psutka, Josef V.
dc.contributor.author	Psutka, Josef
dc.date.accessioned	2016-01-11T05:33:20Z
dc.date.available	2016-01-11T05:33:20Z
dc.date.issued	2013
dc.identifier.citation	CAMPR, Pavel; PRAŽÁK, Aleš; PSUTKA, Josef V.; PSUTKA, Josef. Online speaker adaptation of an acoustic model using face recognition. In: Text, speech and dialogue. Berlin: Springer, 2013, p. 378-385. (Lecture notes in computer science; 8082). ISBN 978-3-642-40584-6.	en
dc.identifier.isbn	978-3-642-40584-6
dc.identifier.uri	http://www.kky.zcu.cz/cs/publications/CamprPavel_2013_OnlineSpeaker
dc.identifier.uri	http://hdl.handle.net/11025/17203
dc.format	8 s.	cs
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	Springer	en
dc.relation.ispartofseries	Lecture notes in computer science; 8082	en
dc.rights	© Pavel Campr - Aleš Pražák - Josef V. Psutka - Josef Psutka	cs
dc.subject	akustický model	cs
dc.subject	adaptace na řečníka	cs
dc.subject	rozpoznávání obličeje	cs
dc.subject	multimodální zpracování	cs
dc.subject	automatické rozpoznávání řeči	cs
dc.title	Online speaker adaptation of an acoustic model using face recognition	en
dc.title.alternative	Online adaptace akustického modelu na řečníka s využitím systému pro rozpoznávání obličejů	cs
dc.type	článek	cs
dc.type	article	en
dc.rights.access	openAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	We have proposed and evaluated a novel approach for online speaker adaptation of an acoustic model based on face recognition. Instead of the traditionally used audio-based speaker identification, we investigated the video modality for the task of speaker detection. A simulated online transcription created by a Large-Vocabulary Continuous Speech Recognition (LVCSR) system for online subtitling is evaluated using speaker-independent acoustic models, gender-dependent models and models of particular speakers. In the experiment, the speaker-dependent acoustic models were trained offline and switched online based on the decision of a face recognizer, which reduced Word Error Rate (WER) by 12% relative to the speaker-independent baseline system.	en
dc.subject.translated	acoustic model	en
dc.subject.translated	speaker adaptation	en
dc.subject.translated	face recognition	en
dc.subject.translated	multimodal processing	en
dc.subject.translated	automatic speech recognition	en
dc.identifier.doi	10.1007/978-3-642-40585-3_48
dc.type.status	Peer-reviewed	en
Appears in collections:	Články / Articles (NTIS); Články / Articles (KKY)

Files attached to this record:

File	Description	Size	Format
CamprPavel_2013_OnlineSpeaker.pdf	Full text	264.95 kB	Adobe PDF


Use this identifier to cite or link to this record: http://hdl.handle.net/11025/17203

All records in DSpace are protected by copyright, with all rights reserved.
