CNN based feature extraction and classification for sign language

Abul Abbas Barbhuiya¹ & Ram Kumar Karsh¹ & Rahul Jain¹

Received: 8 November 2019 / Revised: 4 August 2020 / Accepted: 9 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Hand gestures have been one of the most prominent means of communication since the beginning of the human era. Hand gesture recognition makes human-computer interaction (HCI) more convenient and flexible. Therefore, it is important to identify each character correctly for smooth and error-free HCI. A literature survey reveals that most existing hand gesture recognition (HGR) systems consider only a few simple, easily discriminated gestures when reporting recognition performance. This paper applies deep learning-based convolutional neural networks (CNNs) for robust modeling of static signs in the context of sign language recognition. In this work, CNNs are employed for HGR, where both the alphabets and the numerals of ASL are considered simultaneously. The pros and cons of CNNs used for HGR are also highlighted. The CNN architectures are based on modified AlexNet and modified VGG16 models for classification. Modified pre-trained AlexNet and modified pre-trained VGG16 architectures are used for feature extraction, followed by a multiclass support vector machine (SVM) classifier. Features from different layers are evaluated to obtain the best recognition performance. To examine the accuracy of the HGR schemes, both leave-one-subject-out and random 70–30 forms of cross-validation were adopted. This work also reports the recognition accuracy of each character and the confusion among visually similar gestures. The experiments are performed on a simple CPU system rather than a high-end GPU system to demonstrate the cost-effectiveness of this work. The proposed system achieves a recognition accuracy of 99.82%, which is better than some state-of-the-art methods.

Keywords: Hand gesture · CNN · American sign language (ASL) · Human–computer interface (HCI)
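To make the described pipeline concrete, the sketch below pairs a pre-trained VGG16 feature extractor (fully connected layers removed) with a multiclass SVM, evaluated on a random 70–30 split as in one of the cross-validation schemes above. This is a minimal illustration under assumptions, not the authors' exact configuration: the pooling choice, 224×224 input size, 36-class label space, and placeholder data are all stand-ins for the paper's modified AlexNet/VGG16 setups and chosen feature layers.

```python
# Illustrative sketch: CNN feature extraction + multiclass SVM for static
# hand gesture recognition. All dataset details here are placeholders.
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def extract_features(images):
    """Run images (N, 224, 224, 3) through pre-trained VGG16 conv layers."""
    base = VGG16(weights="imagenet", include_top=False, pooling="avg")
    return base.predict(preprocess_input(images.astype("float32")))

# Placeholder random data stands in for a real ASL gesture dataset:
# 26 alphabet classes + 10 numeral classes = 36 labels.
X = np.random.rand(100, 224, 224, 3) * 255.0
y = np.random.randint(0, 36, size=100)

features = extract_features(X)

# Random 70-30 split, one of the cross-validation schemes in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(
    features, y, test_size=0.3, random_state=0)

clf = SVC(kernel="linear")  # multiclass SVM (one-vs-one under the hood)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```

A leave-one-subject-out evaluation would follow the same pattern, splitting by signer identity (e.g. with scikit-learn's LeaveOneGroupOut) instead of at random.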

* Abul Abbas Barbhuiya
  [email protected]

¹ Speech and Image Processing Group, Electronics and Communication Engineering Department, National Institute of Technology, Silchar, Assam 788010, India

1 Introduction

As technology advances, computers have an ever greater impact on our everyday lives due to the relentless decrease in their cost and size [1]. Hand gestures have been used in communication since the era of human origin. Hand gesture recognition systems [11] have been widely used in sign language interpretation, medical applications, and smart environments. Hand gestures are sensed [27] mainly using three basic types of sensors, viz. 1) vision-based sensors, 2) mounted sensors, and 3) multi-touch screen sensors. Vision-based sensors [7, 13, 17] are less cumbersome and more comfortable than mounted sensors, as they involve no physical contact [35] with the user. Vision-based sensors also offer a much superior working range compared to mounted sensors.