Human action recognition using deep rule-based classifier



Allah Bux Sargano1,2 · Xiaowei Gu2 · Plamen Angelov2 · Zulfiqar Habib1

Multimedia Tools and Applications

Received: 17 August 2019 / Revised: 19 June 2020 / Accepted: 16 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

In recent years, numerous techniques have been proposed for human activity recognition (HAR) from images and videos. These techniques can be divided into two major categories: handcrafted and deep learning-based. Deep learning-based models have produced remarkable results for HAR. However, these models have several shortcomings, such as the requirement for a massive amount of training data, lack of transparency, offline nature, and poor interpretability of their internal parameters. In this paper, a new approach for HAR is proposed, which consists of an interpretable, self-evolving, and self-organizing set of 0-order IF...THEN rules. This approach is entirely data-driven and non-parametric; thus, prototypes are identified automatically during the training process. To demonstrate the effectiveness of the proposed method, a set of high-level features is obtained using a pre-trained deep convolutional neural network model, and a recently introduced deep rule-based (DRB) classifier is applied for classification. Experiments are performed on the challenging benchmark dataset UCF50; the results confirm that the proposed approach outperforms state-of-the-art methods. In addition, an ablation study is conducted to demonstrate the efficacy of the proposed approach by comparing the performance of our DRB classifier with four state-of-the-art classifiers. This analysis reveals that the DRB classifier can perform better than state-of-the-art classifiers, even with limited training samples.

Keywords Human action recognition · Deep learning · Fuzzy rule-based classifier
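The idea of a prototype-based, 0-order IF...THEN rule base can be illustrated with a minimal sketch. This is not the authors' DRB implementation; the class name `ZeroOrderRuleBase`, the fixed `radius` threshold, and the Gaussian firing strength are illustrative assumptions standing in for the paper's self-evolving, density-based scheme, and plain feature vectors stand in for the CNN features.

```python
import numpy as np

class ZeroOrderRuleBase:
    """Illustrative sketch of a 0-order rule base: each class keeps a set of
    prototypes, and a sample fires the rule 'IF (x is close to prototype p)
    THEN (class c)' with winner-takes-all among all rules."""

    def __init__(self, radius=1.0):
        self.radius = radius          # assumed fixed novelty threshold
        self.prototypes = {}          # class label -> list of prototype vectors

    def fit(self, X, y):
        # Data-driven prototype identification: a training sample becomes a
        # new prototype only if it lies farther than `radius` from every
        # existing prototype of its own class (a deliberate simplification
        # of the paper's self-evolving procedure).
        for x, label in zip(X, y):
            x = np.asarray(x, dtype=float)
            protos = self.prototypes.setdefault(label, [])
            if not protos or min(np.linalg.norm(x - p) for p in protos) > self.radius:
                protos.append(x)

    def predict(self, x):
        # Each rule fires with strength exp(-||x - p||^2); the class whose
        # strongest-firing prototype wins is returned (winner-takes-all).
        x = np.asarray(x, dtype=float)
        best_label, best_score = None, -np.inf
        for label, protos in self.prototypes.items():
            score = max(np.exp(-np.linalg.norm(x - p) ** 2) for p in protos)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy usage with 2-D features in place of CNN activations:
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1]])
y = ["walk", "walk", "run", "run"]
clf = ZeroOrderRuleBase()
clf.fit(X, y)
print(clf.predict(np.array([0.05, 0.0])))  # prints "walk"
```

Because every rule is tied to a concrete prototype, a prediction can be explained by naming the prototype that fired, which is the interpretability property the abstract emphasizes.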

Allah Bux Sargano
[email protected]

1 Department of Computer Science, COMSATS University Islamabad, Lahore, Pakistan
2 School of Computing and Communications, InfoLab21, Lancaster University, Bailrigg, UK

1 Introduction

Over the past three decades, human activity recognition (HAR) has been an active research area due to its numerous applications in assisted living, video surveillance, video search, and human-robot interaction [49]. Initially, research focused on simple datasets recorded under controlled settings, e.g., Weizmann [26] and KTH [51]. This was mainly due to the unavailability of larger datasets and of the computing resources required to process them. However, with the rapid increase in video content, there has been a strong urge to

understand the contents of realistic videos. As a consequence, more realistic and challenging HAR video datasets, such as UCF Sports [55], UCF50 [46], and HMDB51 [34], were developed. This is considered a step forward in the development of real-world HAR systems. However, developing robust algorithms for realistic environments remains a challenging task and needs further attention from the research