UPM Institutional Repository

Creating a parallel corpus for the Kazakh sign language and learning


Citation

Yerimbetova, Aigerim and Sakenov, Bakzhan and Sambetbayeva, Madina and Daiyrbayeva, Elmira and Berzhanova, Ulmeken and Othman, Mohamed (2025) Creating a parallel corpus for the Kazakh sign language and learning. Applied Sciences, 15 (5). art. no. 2808. pp. 1-20. ISSN 2076-3417

Abstract

Kazakh Sign Language (KSL) is a crucial communication tool for individuals with hearing and speech impairments. Deep learning, particularly Transformer models, offers a promising approach to improving accessibility in education and communication. This study analyzes the syntactic structure of KSL, identifying its unique grammatical features and deviations from spoken Kazakh. A custom parser was developed to convert Kazakh text into KSL glosses, enabling the creation of a large-scale parallel corpus. Using this resource, a Transformer-based machine translation model was trained, achieving high translation accuracy and demonstrating the feasibility of this approach for enhancing communication accessibility. The research highlights key challenges in sign language processing, such as the limited availability of annotated data. Future work directions include the integration of video data and the adoption of more comprehensive evaluation metrics. This paper presents a methodology for constructing a parallel corpus through gloss annotations, contributing to advancements in sign language translation technology.


Download File

121854.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL or Download Paper: https://www.mdpi.com/2076-3417/15/5/2808

Additional Metadata

Item Type: Article
Divisions: Faculty of Computer Science and Information Technology
DOI Number: https://doi.org/10.3390/app15052808
Publisher: Multidisciplinary Digital Publishing Institute
Keywords: Deep learning; Kazakh sign language; Machine translation; Parallel corpus; Sequence to sequence model; Sign language
Depositing User: Ms. Nur Faseha Mohd Kadim
Date Deposited: 19 Nov 2025 09:45
Last Modified: 19 Nov 2025 09:45
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.3390/app15052808
URI: http://psasir.upm.edu.my/id/eprint/121854
