A comprehensive review of deep learning approaches for video-based sign language recognition: datasets, challenges and insights
Journal article
MNiSW
20
2024 list
| Status: | |
| Authors: | Berzhanova Ulmeken, Yerimbetova Aigerim, Miłosz Marek, Sakenov Bakzhan, Oralbekova Dina, Daiyrbayeva Elmira, Turgan Daniyar |
| Disciplines: | |
| Year of publication: | 2026 |
| Document version: | Electronic |
| Language: | English |
| Journal issue: | 6 |
| Volume: | 10 |
| Article number: | 58 |
| Pages: | 1-36 |
| Impact Factor: | 2.4 |
| Web of Science® Times Cited: | 0 |
| Scopus® Citations: | 0 |
| Databases: | Web of Science | Scopus |
| Result of statutory research: | No |
| Funding: | This research has been funded by the Committee of Science of the Ministry of Science and Higher Education of the Republic of Kazakhstan (Grant No. BR24992875). |
| Conference proceedings: | No |
| Open access publication: | Yes |
| License: | |
| Access method: | Publisher's website |
| Text version: | Final published version |
| Time of release: | At the time of publication |
| Open access publication date: | 22 May 2026 |
| Abstracts: | English |
| This study presents a comprehensive review of more than 100 research papers on sign language recognition (SLR) published between 2020 and 2026. The analysis focuses on deep learning approaches applied to video-based SLR, including spatiotemporal feature extraction, temporal modeling, attention mechanisms, motion-based representations, hybrid frameworks, transfer learning methods and other methods. Particular attention is given to how these methods model spatiotemporal dynamics and capture subtle gesture characteristics in sign language communication. The review highlights several recent developments, such as the introduction of specialized datasets, the emergence of real-time recognition systems, and the integration of multimodal fusion strategies. At the same time, persistent challenges remain, including data scarcity in low-resource sign languages, limited linguistic standardization of datasets, and insufficient model interpretability. The findings underline the importance of developing scalable and generalizable models capable of handling diverse datasets and user variability. The distinct contributions of this review are fourfold: (1) a comprehensive synthesis of over 100 studies published between 2020 and 2026, covering the full spectrum of deep learning architectures for video-based SLR; (2) a structured six-category taxonomy enabling systematic cross-architectural comparison; (3) a comprehensive focus on low-resource sign languages, which remain underrepresented in the existing literature; and (4) a critical analysis of the current benchmark landscape for low-resource sign languages, identifying key gaps and outlining strategic directions for future dataset development. These contributions are intended to guide further research toward more robust, inclusive, and universally applicable SLR systems. |
