An Efficient Attention Model with Critical Frames Identification for Sign Language Recognition (CRAM-SLR)

Renjith Sasidharan; Aneesh Varghese; Manazhy Rashmi; Poorna S. Surendran

doi:10.47852/bonviewJCCE52025288

Authors

Renjith Sasidharan, Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham-Amritapuri, India, https://orcid.org/0009-0005-2803-2703
Aneesh Varghese, Amazon Web Services, USA
Manazhy Rashmi, Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham-Amritapuri, India
Poorna S. Surendran, Department of Electronics and Communication Engineering, Amrita Vishwa Vidyapeetham-Amritapuri, India, https://orcid.org/0000-0003-3876-1254

DOI:

https://doi.org/10.47852/bonviewJCCE52025288

Keywords:

sign language, classification, hybrid CNN-BiLSTM, convolutional recurrent attention

Abstract

Sign language recognition (SLR) plays a crucial role in enhancing communication accessibility for individuals who are deaf or hard of hearing. This paper introduces the convolutional recurrent attention model (CRAM), a novel deep learning framework specifically designed to improve recognition performance in low-resource sign languages such as Indian Sign Language (ISL) and Arabic Sign Language (ArSL). CRAM features a Critical Frames Identification algorithm that leverages the histogram of oriented gradients descriptor to extract the most informative key frames from sign videos, thereby reducing computational overhead while retaining essential gesture information. The model architecture combines convolutional layers to extract rich spatial features, bidirectional long short-term memory networks for effective temporal sequence modeling, and an attention mechanism to dynamically prioritize crucial frames. This integration enables CRAM to capture complex spatial-temporal dependencies inherent in sign gestures. Extensive experiments conducted on ISL and ArSL datasets validate the model's effectiveness, with CRAM achieving state-of-the-art accuracy, precision, and recall. The results highlight CRAM's potential in advancing robust and inclusive SLR solutions for underrepresented sign languages, promoting more effective gesture-based human-computer interaction.
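The abstract does not spell out how Critical Frames Identification scores or selects frames, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes grayscale frames given as nested lists of floats. In place of a full histogram of oriented gradients (with cells and block normalization) it uses a simplified whole-frame orientation histogram. It also assumes that a frame is "critical" when its descriptor differs strongly from that of the preceding frame. The names `orientation_histogram` and `select_key_frames` and the parameters `bins` and `k` are hypothetical.

```python
import math


def orientation_histogram(frame, bins=9):
    """Simplified HOG-style descriptor (assumption: one histogram per frame).

    Accumulates gradient magnitude into unsigned orientation bins over
    [0, 180) degrees, then L2-normalizes the histogram.
    """
    height, width = len(frame), len(frame[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = frame[y][x + 1] - frame[y][x - 1]
            gy = frame[y + 1][x] - frame[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against angle rounding up to 180.0.
            hist[int(angle / (180.0 / bins)) % bins] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


def select_key_frames(frames, k):
    """Return sorted indices of k key frames (hypothetical selection rule).

    Always keeps frame 0, then adds the k - 1 frames whose descriptor is
    farthest (Euclidean distance) from the previous frame's descriptor.
    """
    descs = [orientation_histogram(f) for f in frames]
    scores = [math.dist(descs[i], descs[i - 1]) for i in range(1, len(frames))]
    ranked = sorted(range(1, len(frames)), key=lambda i: scores[i - 1], reverse=True)
    return sorted([0] + ranked[: k - 1])


# Toy example: a vertical edge (A) and a horizontal edge (B) on 6x6 frames.
A = [[0.0] * 3 + [1.0] * 3 for _ in range(6)]
B = [[0.0] * 6 for _ in range(3)] + [[1.0] * 6 for _ in range(3)]
print(select_key_frames([A, A, B, B, A], k=3))
```

In the toy sequence, the selected indices are the first frame and the two frames where the edge orientation changes. In a pipeline like the one the abstract describes, only the selected frames would then be passed to the convolutional, BiLSTM, and attention stages.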

Received: 23 January 2025 | Revised: 13 May 2025 | Accepted: 7 June 2025

Conflicts of Interest

The authors declare that they have no conflicts of interest related to this work.

Data Availability Statement

Data are available from the corresponding author upon reasonable request.

Author Contribution Statement

Renjith Sasidharan: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Data curation, Writing – original draft. Aneesh Varghese: Software, Resources, Visualization. Manazhy Rashmi: Writing – review & editing, Supervision, Project administration. Poorna S. Surendran: Writing – review & editing, Project administration.