Auslan Sign Language Image Recognition Using Deep Neural Network
DOI: https://doi.org/10.70670/sra.v3i3.1008

Keywords: Sign Language Recognition; Auslan; Convolutional Neural Network; Random Forest; Support Vector Machine (SVM); Canny Edge Detection; MediaPipe; Real-Time Recognition.

Abstract
Sign language recognition improves accessibility for deaf and hard-of-hearing people by translating hand gestures into machine-interpretable labels. This paper presents a hybrid pipeline for static Auslan digit recognition (classes 0–2) that combines convolutional neural networks (CNNs) for automated feature extraction with two classical classifiers: a Support Vector Machine (SVM) and a Random Forest (RF). A grayscale dataset of 6,000 images (2,000 per class) was pre-processed with Canny edge detection to emphasize contour information, then resized to the model input dimensions. Two CNN feature extractors were trained, and their flattened feature vectors were fed to an RBF-kernel SVM and a 100-tree Random Forest. Experimental evaluation shows that the CNN + Random Forest hybrid attained the highest validation accuracy (99.75%), outperforming both the baseline end-to-end CNN (≈95%) and the CNN + SVM hybrid (99.67%). The trained pipeline was also integrated into a MediaPipe-based real-time testing setup to demonstrate practical applicability. The results indicate that combining deep feature extraction with classical and ensemble classifiers improves robustness and generalization for static gesture recognition. Future work will expand class coverage, incorporate dynamic gesture modelling, and investigate model compression for embedded deployment.
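As a rough illustration of the pipeline the abstract describes, the sketch below chains Canny edge preprocessing, a small CNN feature extractor, and the two classical classifiers. The image resolution, Canny thresholds, layer sizes, and synthetic stand-in data are all assumptions for the sketch, not the authors' exact configuration.

```python
import cv2
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

IMG_SIZE = 64  # assumed input resolution; not stated in the abstract

def preprocess(img: np.ndarray) -> np.ndarray:
    """Apply Canny edge detection to emphasize contours, then resize."""
    edges = cv2.Canny(img, 100, 200)                # thresholds are assumptions
    edges = cv2.resize(edges, (IMG_SIZE, IMG_SIZE))
    return edges.astype("float32") / 255.0

# Functional CNN so the penultimate layer can be reused as a feature extractor.
inputs = keras.Input(shape=(IMG_SIZE, IMG_SIZE, 1))
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
feats = layers.Dense(128, activation="relu")(x)          # flattened feature vector
outputs = layers.Dense(3, activation="softmax")(feats)   # digit classes 0-2
cnn = keras.Model(inputs, outputs)
cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Synthetic stand-in for the 6,000-image grayscale dataset.
raw = (np.random.rand(300, 96, 96) * 255).astype("uint8")
y = np.random.randint(0, 3, size=300)
X = np.stack([preprocess(im) for im in raw])[..., None]
cnn.fit(X, y, epochs=1, verbose=0)

# Drop the softmax head and feed the deep features to the classical classifiers.
feature_model = keras.Model(inputs, feats)
F = feature_model.predict(X, verbose=0)
rf = RandomForestClassifier(n_estimators=100).fit(F, y)  # 100-tree Random Forest
svm = SVC(kernel="rbf").fit(F, y)                        # RBF-kernel SVM
print("RF:", rf.score(F, y), "SVM:", svm.score(F, y))
```

In this arrangement the CNN is trained once end-to-end and then frozen as a fixed feature extractor, so the RF and SVM only see the 128-dimensional dense activations; a real-time variant would run the same preprocess-extract-classify chain on hand crops located by MediaPipe.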