Sign Language Recognition based on Deep Learning via MediaPipe for people with speech and hearing impairments
| Publication type: | Journal article |
| MNiSW points: | 5 (journal outside the ministerial list) |
| Status: | |
| Authors: | Yerimbetova Aigerim, Berzhanova Ulmeken, Sakenov Bakzhan, Miłosz Marek, Daiyrbayeva Elmira, Bayekeyeva Ainur, Mamyrbayev Orken, Telman Duman |
| Disciplines: | |
| Year of publication: | 2026 |
| Document version: | Print | Electronic |
| Language: | English |
| Volume: | 275 |
| Pages: | 359–368 |
| Result of statutory research: | NO |
| Conference material: | NO |
| Open Access publication: | YES |
| License: | |
| Means of access: | Publisher's website |
| Text version: | Final published version |
| Time of release: | At the moment of publication |
| Date of OA publication: | 20 March 2026 |
| Abstracts: | English |
| This paper presents a methodological framework for constructing a sign language recognition system that employs a Temporal Convolutional Network (TCN) encoder combined with a Transformer-based decoder. The proposed approach automatically converts gesture video sequences into textual outputs while preserving both temporal dynamics and spatial structure. MediaPipe is utilised to extract 3D coordinates of 225 keypoints from each frame, and these features are pre-processed to facilitate efficient model training. The architecture was experimentally evaluated on the Kazakh-Russian Sign Language (KRSL) corpus and demonstrated its applicability to practical gesture recognition scenarios. The study addresses the core issues of sign language recognition: the diversity of sign language users, the scarcity of training data, and the absence of pre-trained models for low-resource languages. Overall, the method advances inclusive communication technologies, promoting more accessible interaction for people with speech and hearing impairments and supporting a range of inclusive applications. |
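
The keypoint-extraction step described in the abstract can be illustrated in code. Below is a minimal sketch, assuming (the record does not specify this) that the figure of 225 denotes 225 coordinate values per frame, i.e. 75 MediaPipe Holistic landmarks (33 pose + 21 per hand) × 3 coordinates; the helper names `landmarks_to_array` and `extract_keypoints` are illustrative, not taken from the paper.

```python
# Hedged sketch: per-frame 3D keypoint extraction with MediaPipe Holistic.
# ASSUMPTION: "225 keypoints" is read as 75 landmarks x 3 coordinates
# (33 pose + 21 left-hand + 21 right-hand); the paper may define it differently.
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic

def landmarks_to_array(landmark_list, n_points):
    """Flatten a MediaPipe landmark list to (n_points * 3,); zeros if undetected."""
    if landmark_list is None:
        return np.zeros(n_points * 3, dtype=np.float32)
    return np.array([[lm.x, lm.y, lm.z] for lm in landmark_list.landmark],
                    dtype=np.float32).flatten()

def extract_keypoints(video_path):
    """Return an array of shape (num_frames, 225) for one gesture video."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp_holistic.Holistic(static_image_mode=False) as holistic:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            frames.append(np.concatenate([
                landmarks_to_array(results.pose_landmarks, 33),
                landmarks_to_array(results.left_hand_landmarks, 21),
                landmarks_to_array(results.right_hand_landmarks, 21),
            ]))
    cap.release()
    return np.stack(frames) if frames else np.empty((0, 225), dtype=np.float32)
```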

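On the model side, the following PyTorch sketch shows one plausible realisation of the TCN-encoder / Transformer-decoder pairing the abstract names. All hyperparameters (hidden size, number of levels, attention heads, vocabulary size) and class names are illustrative assumptions, not the configuration reported for the KRSL experiments.

```python
# Hedged sketch: dilated temporal-convolution encoder over 225-dim keypoint
# sequences feeding a standard Transformer decoder that emits text tokens.
# All sizes below are illustrative placeholders, not the paper's settings.
import torch
import torch.nn as nn

class TCNEncoder(nn.Module):
    def __init__(self, in_dim=225, hidden=256, levels=4):
        super().__init__()
        layers, ch = [], in_dim
        for i in range(levels):  # exponentially growing dilation widens the receptive field
            layers += [nn.Conv1d(ch, hidden, kernel_size=3,
                                 padding=2 ** i, dilation=2 ** i),
                       nn.ReLU()]
            ch = hidden
        self.net = nn.Sequential(*layers)

    def forward(self, x):                      # x: (batch, time, 225)
        return self.net(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, hidden)

class SignToText(nn.Module):
    def __init__(self, vocab_size=1000, hidden=256):
        super().__init__()
        self.encoder = TCNEncoder(hidden=hidden)
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerDecoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, keypoints, tokens):
        memory = self.encoder(keypoints)       # encoded gesture sequence
        causal = torch.triu(torch.full((tokens.size(1), tokens.size(1)),
                                       float("-inf")), diagonal=1)
        dec = self.decoder(self.embed(tokens), memory, tgt_mask=causal)
        return self.out(dec)                   # (batch, tgt_len, vocab_size)
```

Given a batch of extracted sequences of shape (batch, time, 225) and target token ids of shape (batch, tgt_len), the model returns per-token logits suitable for a cross-entropy training objective.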