Title: One Model is Not Enough: Ensembles for Isolated Sign Language Recognition
Authors: Hrúz, Marek
Gruber, Ivan
Kanis, Jakub
Boháček, Matyáš
Hlaváč, Miroslav
Krňoul, Zdeněk
Citation: HRÚZ, M., GRUBER, I., KANIS, J., BOHÁČEK, M., HLAVÁČ, M., KRŇOUL, Z. One Model is Not Enough: Ensembles for Isolated Sign Language Recognition. SENSORS, 2022, vol. 22, no. 13, unpaginated. ISSN: 1424-8220
Issue Date: 2022
Publisher: MDPI
Document type: article
URI: 2-s2.0-85133217387
http://hdl.handle.net/11025/51652
ISSN: 1424-8220
Keywords in different language: sign language recognition; CNN; Transformer; ensemble
Abstract in different language: In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D and TimeSformer, and one pose-based approach, SPOTER. The appearance-based approaches are trained on a few different data modalities, whereas the performance of SPOTER is evaluated on different types of preprocessing. All the methods are tested on two publicly available datasets: AUTSL and WLASL300. We experiment with ensemble techniques to achieve new state-of-the-art results of 73.84% accuracy on the WLASL300 dataset by using the CMA-ES optimization method to find the best ensemble weight parameters. Furthermore, we present an ensembling technique based on the Transformer model, which we call Neural Ensembler.
Rights: © authors
Appears in Collections: Články / Articles (NTIS)
Články / Articles (KKY)
OBD

Files in This Item:
File: sensors-22-05043-v3-2.pdf
Size: 1.12 MB
Format: Adobe PDF


Please use this identifier to cite or link to this item: http://hdl.handle.net/11025/51652

