One Model is Not Enough: Ensembles for Isolated Sign Language Recognition

Hrúz, Marek; Gruber, Ivan; Kanis, Jakub; Boháček, Matyáš; Hlaváč, Miroslav; Krňoul, Zdeněk

Title:	One Model is Not Enough: Ensembles for Isolated Sign Language Recognition
Authors:	Hrúz, Marek; Gruber, Ivan; Kanis, Jakub; Boháček, Matyáš; Hlaváč, Miroslav; Krňoul, Zdeněk
Citation:	HRÚZ, M., GRUBER, I., KANIS, J., BOHÁČEK, M., HLAVÁČ, M., KRŇOUL, Z. One Model is Not Enough: Ensembles for Isolated Sign Language Recognition. SENSORS, 2022, vol. 22, no. 13, unpaginated. ISSN: 1424-8220
Issue Date:	2022
Publisher:	MDPI
Document type:	article
URI:	2-s2.0-85133217387; http://hdl.handle.net/11025/51652
ISSN:	1424-8220
Keywords in different language:	sign language recognition; CNN; Transformer; ensemble
Abstract in different language:	In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D and TimeSformer, and one pose-based approach, SPOTER. The appearance-based approaches are trained on a few different data modalities, whereas the performance of SPOTER is evaluated on different types of preprocessing. All the methods are tested on two publicly available datasets: AUTSL and WLASL300. We experiment with ensemble techniques to achieve new state-of-the-art results of 73.84% accuracy on the WLASL300 dataset by using the CMA-ES optimization method to find the best ensemble weight parameters. Furthermore, we present an ensembling technique based on the Transformer model, which we call Neural Ensembler.
Rights:	© authors
Appears in Collections:	Články / Articles (NTIS); Články / Articles (KKY); OBD
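
The abstract describes combining several recognizers with tuned ensemble weights. A minimal sketch of that general idea follows, assuming each model outputs a class-probability vector per sample. The function names (`ensemble_predict`, `accuracy`, `random_search`) are hypothetical, and the plain random search is only a dependency-free stand-in for the CMA-ES optimizer named in the abstract; it does not reproduce the authors' implementation.

```python
import random


def ensemble_predict(prob_sets, weights):
    """Fuse per-model class probabilities by a weighted sum; return the argmax class."""
    n_classes = len(prob_sets[0])
    fused = [
        sum(w * probs[c] for w, probs in zip(weights, prob_sets))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=fused.__getitem__)


def accuracy(weights, samples, labels):
    """Fraction of samples whose fused prediction matches the label."""
    hits = sum(ensemble_predict(s, weights) == y for s, y in zip(samples, labels))
    return hits / len(labels)


def random_search(samples, labels, n_models, trials=200, seed=0):
    """Stand-in optimizer (NOT CMA-ES): keep the best of `trials` random weight vectors."""
    rng = random.Random(seed)
    best_w = [1.0 / n_models] * n_models  # start from uniform weights
    best_acc = accuracy(best_w, samples, labels)
    for _ in range(trials):
        w = [rng.random() for _ in range(n_models)]
        acc = accuracy(w, samples, labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Toy validation set: two models, two classes, two samples (made-up numbers).
samples = [
    [[0.6, 0.4], [0.1, 0.9]],
    [[0.7, 0.3], [0.4, 0.6]],
]
labels = [0, 0]
weights, acc = random_search(samples, labels, n_models=2)
```

In practice the weights would be tuned on a held-out validation split rather than the test set, and a dedicated optimizer could replace `random_search` without changing `ensemble_predict`.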

Files in This Item:

File	Size	Format
sensors-22-05043-v3-2.pdf	1.12 MB	Adobe PDF

Please use this identifier to cite or link to this item: http://hdl.handle.net/11025/51652
