Echo State Networks and Long Short-Term Memory for Continuous Gesture Recognition: a Comparative Study
Doreen Jirak1 · Stephan Tietz2 · Hassan Ali1 · Stefan Wermter1
Received: 20 February 2020 / Accepted: 15 July 2020
© The Author(s) 2020
Abstract
Recent developments in sensors that allow tracking of human movements and gestures enable rapid progress of applications in domains like medical rehabilitation or robotic control. The inertial measurement unit (IMU) in particular is an excellent device for real-time scenarios as it rapidly delivers data input. Therefore, a computational model must be able to learn gesture sequences in a fast yet robust way. We recently introduced an echo state network (ESN) framework for continuous gesture recognition (Tietz et al., 2019) including novel approaches for gesture spotting, i.e., the automatic detection of the start and end phase of a gesture. Although our results showed good classification performance, we identified significant factors that negatively impact performance, such as subgestures and gesture variability. To address these issues, we include experiments with Long Short-Term Memory (LSTM) networks, which are state-of-the-art models for sequence processing, to compare the obtained results with our framework and to evaluate their robustness regarding pitfalls in the recognition process. In this study, we analyze the two conceptually different approaches to processing continuous, variable-length gesture sequences, which yields interesting results when comparing the distinct gesture accomplishments. In addition, our results demonstrate that our ESN framework achieves performance comparable to that of the LSTM network but has significantly lower training times. We conclude from the present work that ESNs are viable models for continuous gesture recognition, delivering reasonable performance for applications requiring real-time performance as in robotic or rehabilitation tasks. From our discussion of this comparative study, we suggest prospective improvements on both the experimental and network architecture level.
Keywords Continuous gesture recognition · Echo state networks · Long Short-Term Memory
1 Department of Informatics, Knowledge Technology, University of Hamburg, Vogt-Kölln-Str. 30, 22527 Hamburg, Germany
2 Technical University of Berlin, Strasse des 17. Juni 135, 10623 Berlin, Germany
Introduction
Continuous gesture recognition is a challenging task due to three critical aspects: (1) the correct identification of the start and end of the actual gesture, called gesture spotting, (2) the recognition of a gesture of possibly variable length, also called inter-subject variability, and (3) the accurate distinction between an active gesture and subtle movements or silent phases like pauses. The correct yet fast recognition of gestures is an important research area predominantly for vision-based applications in human-robot interaction (HRI) or human-computer interaction (HCI). Although visual gesture recognition allows the most intu