Unsupervised Learning Approach for Abnormal Event Detection in Surveillance Video by Hybrid Autoencoder

  • PDF / 2,212,723 Bytes
  • 15 Pages / 439.37 x 666.142 pts Page_size
  • 95 Downloads / 232 Views

DOWNLOAD

REPORT


Unsupervised Learning Approach for Abnormal Event Detection in Surveillance Video by Hybrid Autoencoder Fuqiang Zhou1 · Lin Wang1 · Zuoxin Li1 · Wangxia Zuo1 · Haishu Tan2

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract Abnormal detection plays an important role in video surveillance. LSTM encoder–decoder is used to learn representation of video sequences and applied for detecting abnormal event in complex environment. The learned representation of LSTM encoder–decoder is learned from encoder, and it is crucial for decoder. However, LSTM encoder–decoder generally fails to account for the global context of the learned representation with a fixed dimension representation. In this paper, we explore a hybrid autoencoder architecture, which not only extracts better spatio-temporal context, but also improves the extrapolate capability of the corresponding decoder by the shortcut connection. The experiment shows that the hybrid model performs better than the state-of-the-art anomaly detection methods in both qualitative and quantitative ways on benchmark datasets. Keywords Autoencoder · LSTM · Abnormality detection

1 Introduction Video surveillance is widely used for various fields, such as security guards, medical monitoring, traffic monitoring, etc. Among these research fields, anomaly detection plays an important role in discovering various irregularities. Since many first-hand video datasets are obtained without labels and abnormal event is hard to define in advance as well, unsupervised model is more practical and many existing methods about abnormality detection focus on learning the normal pattern, and then abnormal events are identified as those which deviate from the normal ones. Many studies apply unsupervised learning algorithm to detect abnormal events. An unsupervised dynamic sparse coding approach is proposed in [1] for detecting unusual events in videos, the method is based on online sparse constructibility of query signals from an atomically learned event dictionary, which forms a sparse coding bases and involves spatio-

B

Fuqiang Zhou [email protected]

1

School of Instrumentation Science and Opto-electronics Engineering, Beihang University, Beijing 100191, China

2

Department of Electronic Information Engineering, Foshan University, Foshan 528000, China

123

F. Zhou et al.

temporal cuboids. Based on multi-scale histograms of optical flow, Cong et al. [2] use sparse coding reconstruction error as a measure for abnormality. An unsupervised probabilistic framework is developed and used to detect abnormal events based on non-parametric statistical notion of spatio-temporal locations and scales by Saligrama and Chen [3]. Ricci et al. [4] propose a convex hierarchical clustering approach which relies on Earth Mover’s prototype to analysis complex scenes. However, these methods employ hand-crafted features to model normal patterns. In recent years, deep learning becomes more and more popular as it can reduce the laborintensive effort in aspect of feature abstraction. In par