

Joint space representation and recognition of sign language fingerspelling using Gabor filter and convolutional neural network

Hamzah Luqman1 · El-Sayed M. El-Alfy1 · Galal M. BinMakhashen1

Received: 15 November 2019 / Revised: 23 September 2020 / Accepted: 29 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

In this work, we propose a new technique for visual recognition of sign language fingerspelling by fusing multiple spatial and spectral representations of manual gesture images using a convolutional neural network. This problem is gaining prominence both for communication with hearing-impaired people and for human-machine interaction. The proposed technique computes Gabor spectral representations of spatial images of hand sign gestures and uses an optimized convolutional neural network to classify the gestures in the joint space into their corresponding classes. Various ways of combining the two modalities are explored to identify the model that improves robustness and recognition accuracy. The proposed system is evaluated on three databases (MNIST-ASL, ArSL, and MUASL) under different conditions, and the attained results outperform state-of-the-art techniques.

Keywords Human-machine interaction · Sign language · Hand gesture · Gabor filter · Deep learning · Multimodal recognition systems
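To make the pipeline described in the abstract concrete, the following is a minimal sketch of the joint spatial-spectral idea: Gabor responses of a gesture image are stacked channel-wise with the raw grayscale image and fed to a CNN classifier. The filter-bank parameters (kernel size, wavelength, orientations) and the network architecture here are illustrative assumptions, not the paper's optimized settings or its exact fusion strategy.

```python
# Sketch: joint spatial-spectral representation via a Gabor filter bank
# fused (channel-wise) with the raw image, classified by a small CNN.
# Parameters and architecture are assumptions for illustration only.
import numpy as np
import cv2
import tensorflow as tf

def gabor_bank(ksize=9, sigma=2.0, lambd=4.0, gamma=0.5, n_orient=4):
    """Build a bank of Gabor kernels at n_orient evenly spaced orientations."""
    thetas = np.arange(n_orient) * np.pi / n_orient
    return [cv2.getGaborKernel((ksize, ksize), sigma, t, lambd, gamma, 0,
                               ktype=cv2.CV_32F) for t in thetas]

def joint_representation(gray):
    """Stack the normalized grayscale image with its Gabor responses."""
    img = gray.astype(np.float32) / 255.0
    maps = [img] + [cv2.filter2D(img, -1, k) for k in gabor_bank()]
    return np.stack(maps, axis=-1)          # shape: H x W x (1 + n_orient)

def build_cnn(input_shape, n_classes):
    """Generic small CNN over the joint representation (not the tuned model)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

# Example: 28x28 gesture images (as in MNIST-ASL), 24 static letter classes.
x = joint_representation(np.random.randint(0, 256, (28, 28), dtype=np.uint8))
model = build_cnn(x.shape, n_classes=24)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Channel-wise stacking is only one possible fusion point; the paper explores several ways of combining the spatial and spectral modalities before settling on the best-performing model.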

1 Introduction

Sign language facilitates communication with people who are deaf or who have auditory impairments ranging from mild to severe. According to the World Health Organization (WHO)1, around 466 million individuals worldwide, i.e. about 6% of the world's population, have disabling hearing loss; this number is estimated to double by 2050. Hearing-impaired people depend greatly on sign language for daily activities and engagement in their societies, e.g. communicating with other people, accessing information on television, and learning in schools.

1 https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss

✉ El-Sayed M. El-Alfy
[email protected]

1 King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia


Sign languages are geographically specific: several distinct languages exist around the world, sometimes within the same country [48]. Moreover, countries with similar spoken languages may still have different sign languages, such as American Sign Language and British Sign Language. Sign language is a form of visual gestural communication that simultaneously employs hand gestures, facial expressions, and postures of other body parts to convey meaning [32]. Hand gestures are the primary component of a sign language and are used by signers to express their emotions and thoughts [11]. Hand gestures in a sign language can be classified into static and dynamic gestures, depending on whether or not hand motion is part of the sign's interpretation. Static gestures complement dynamic gestures and depend mainly on the shape and orientation of the hand and fingers [35]. Static gestures in sign language