Hands in Focus: Sign Language Recognition Via Top-Down Attention

Noha Sarhan; Christian Wilms; Vanessa Closius; Ulf Brefeld; Simone Frintrop

doi:10.1109/icip49359.2023.10222729

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Professorship for Information Systems, in particular Machine Learning

In this paper, we propose a novel Sign Language Recognition (SLR) model that leverages task-specific knowledge to incorporate Top-Down (TD) attention, focusing the processing of the network on the most relevant parts of the input video sequence. For SLR, this includes information about the hands' shape, orientation, position, and motion trajectory. Our model consists of three streams that process RGB, optical flow, and TD attention data. For the TD attention, we generate pixel-precise attention maps focusing on both hands, thereby retaining valuable hand information while eliminating distracting background information. Our proposed method outperforms the state of the art on a challenging large-scale dataset by over 2% and achieves strong results with a much simpler architecture compared to other systems on the newly released AUTSL dataset [1].
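The two ideas in the abstract, masking a frame with a pixel-precise hand attention map and combining three streams, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the streams are fused, so averaging per-class probabilities is an assumption made here, and the function names (`apply_attention`, `fuse_streams`) and all toy values are hypothetical.

```python
import math


def apply_attention(frame, mask):
    """Keep pixels where the hand mask is 1 and zero out the background."""
    return [
        [pixel * m for pixel, m in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(rgb_scores, flow_scores, td_scores):
    """Average the class probabilities of the three streams.

    Averaging is an assumed fusion rule chosen for illustration only.
    """
    probs = [softmax(s) for s in (rgb_scores, flow_scores, td_scores)]
    return [sum(per_class) / len(probs) for per_class in zip(*probs)]


# Toy 2x3 grayscale frame and a binary hand mask of the same size.
frame = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 1, 1], [0, 0, 1]]
masked = apply_attention(frame, mask)

# Toy per-class scores from the RGB, optical flow, and TD attention streams.
fused = fuse_streams([2.0, 0.5, 0.1], [1.5, 1.0, 0.2], [3.0, 0.2, 0.1])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this toy setup, `masked` keeps only the pixels under the hand mask, and `predicted` is the index of the sign class with the highest averaged probability.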

Original language	English
Title of host publication	2023 IEEE International Conference on Image Processing, ICIP 2023 - Proceedings
Number of pages	5
Place of Publication	Piscataway
Publisher	IEEE Electromagnetic Compatibility Society
Publication date	08.10.2023
Pages	2555-2559
ISBN (print)	978-1-7281-9836-1
ISBN (electronic)	978-1-7281-9835-4
DOIs	https://doi.org/10.1109/icip49359.2023.10222729
Publication status	Published - 08.10.2023
Event	2023 IEEE International Conference on Image Processing, Kuala Lumpur Convention Centre, Kuala Lumpur, Malaysia; Duration: 08.10.2023 → 11.10.2023; Conference number: 30; https://2023.ieeeicip.org/

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Research areas

Informatics - sign language recognition, top-down attention, deep learning
