Title: Semi-Supervised Learning Approach for Fine Grained Human Hand Action Recognition in Industrial Assembly
Authors: Sturm, Fabian
Sathiyababu, Rahul
Hergenroether, Elke
Siegel, Melanie
Source document citation: WSCG 2023: full papers proceedings: 1. International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, p. 340-350.
Date of issue: 2023
Publisher: Václav Skala - UNION Agency
Document type: conference paper
conferenceObject
URI: http://hdl.handle.net/11025/54442
ISBN: 978-80-86943-32-9
ISSN: 2464-4617 (print)
2464-4625 (CD/DVD)
Keywords: human action recognition; industrial assembly; semi-supervised learning; transfer learning; transformer
Abstract: Until now, it has been impossible to imagine industrial manual assembly without humans due to their flexibility and adaptability. But the assembly process does not always benefit from human intervention. The error-proneness of the assembler due to disturbance, distraction or inattention requires intelligent support of the employee and is ideally suited for deep learning approaches because of the permanently occurring and repetitive data patterns. However, there is the problem that the labels of the data are not always sufficiently available. In this work, a spatio-temporal transformer model approach is used to address the circumstances of few labels in an industrial setting. A pseudo-labeling method from the field of semi-supervised transfer learning is applied for model training, and the entire architecture is adapted to the fine-grained recognition of human hand actions in assembly. This implementation significantly improves the generalization of the model during the training process over different variations of strong and weak classes from the ground truth and proves that it is possible to work with deep learning technologies in an industrial setting, even with few labels. In addition to the main goal of improving the generalization capabilities of the model by using less data during training and exploring different variations of appropriate ground truth and new classes, the recognition capabilities of the model are improved by adding convolution to the temporal embedding layer, which increases the test accuracy by over 5% compared to a similar predecessor model.
Rights: © Václav Skala - UNION Agency
Appears in collections: WSCG 2023: Full Papers Proceedings

Files attached to this record:
File            Description    Size       Format
F89-full.pdf    Full text      7.54 MB    Adobe PDF


Use this identifier to cite or link to this record: http://hdl.handle.net/11025/54442

All records in DSpace are protected by copyright, all rights reserved.