Convolutional neural network with spatial pyramid pooling for hand gesture recognition
ORIGINAL ARTICLE
Yong Soon Tan1 • Kian Ming Lim1 • Connie Tee1 • Chin Poo Lee1 • Cheng Yaw Low1
Received: 22 October 2019 / Accepted: 2 September 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract Hand gestures provide a means for humans to interact. While hand gestures play a significant role in human–computer interaction, they also break down the communication barrier and simplify the communication process between the general public and the hearing-impaired community. This paper outlines a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN–SPP, for vision-based hand gesture recognition. SPP mitigates the problem found in conventional pooling by stacking multi-level pooling to extend the features fed into a fully connected layer. Given inputs of varying sizes, SPP also yields a fixed-length feature representation. Extensive experiments have been conducted to evaluate CNN–SPP on two well-known American Sign Language (ASL) datasets and one NUS hand gesture dataset. Our empirical results show that CNN–SPP outperforms other deep learning-driven approaches.
Keywords Convolutional neural network (CNN) • Spatial pyramid pooling (SPP) • Hand gesture recognition • Sign language recognition
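The fixed-length property of SPP mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pyramid levels (1, 2, 4), the single-channel feature map, the use of max pooling, and the names `spp_max_pool` and `make_map` are all assumptions made for illustration. Each level divides the map into an n x n grid, pools every cell, and the results are concatenated, so the output length depends only on the levels and not on the input size.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (list of rows) over an n x n grid per level.

    The levels are an illustrative assumption, not the paper's configuration.
    Returns a flat list of length sum(n * n for n in levels).
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keep every bin non-empty.
            top = (i * height) // n
            bottom = ((i + 1) * height + n - 1) // n
            for j in range(n):
                left = (j * width) // n
                right = ((j + 1) * width + n - 1) // n
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(top, bottom)
                    for c in range(left, right)
                ))
    return pooled


def make_map(height, width):
    """Hypothetical helper: build a feature map filled with 0, 1, 2, ..."""
    return [[r * width + c for c in range(width)] for r in range(height)]


small = spp_max_pool(make_map(6, 8))
large = spp_max_pool(make_map(13, 9))
print(len(small), len(large))  # both lengths equal 1 + 4 + 16 = 21
```

In a CNN the same pooling would be applied to every channel of the last convolutional output, and the concatenated vector would feed the fully connected layer.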
1 Faculty of Information Science and Technology (FIST), Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia
Kian Ming Lim [email protected] • Yong Soon Tan [email protected]
1 Introduction Hand gesture recognition not only has the potential to transform human–computer interaction (HCI) but also simplifies communication between people, especially between the deaf community and the general public. People with deafness communicate with others primarily through a language of hand gestures, formally known as sign language. Although sign language is well known, only a minority of the general public can interpret it. A hand gesture recognition system breaks down this communication barrier by allowing people to understand the meaning behind different sign language gestures effortlessly. There are many challenges in vision-based hand gesture recognition, for example, hand size variation, skin tone and color, illumination, viewpoint variation, similarity between different gestures and complex natural backgrounds. In short, recognizing static hand gestures in images can be very challenging due to the diverse conditions presented in the images. In recent years, many works have been carried out to address these problems.
2 Related works Prior to the success of deep learning, hand-crafted feature extraction methods paired with a classifier were the more popular and widely adopted approach. Hand-crafted feature extraction methods extract useful features from images, and the extracted features are t