G-LBM: Generative Low-Dimensional Background Model Estimation from Video Sequences
Abstract. In this paper, we propose a computationally tractable and theoretically supported non-linear low-dimensional generative model to represent real-world data in the presence of noise and sparse outliers. The non-linear low-dimensional manifold discovery of data is done by describing a joint distribution over observations and their low-dimensional representations (i.e., manifold coordinates). Our model, called the generative low-dimensional background model (G-LBM), admits variational operations on the distribution of the manifold coordinates and simultaneously generates a low-rank structure of the latent manifold given the data. Our probabilistic model therefore retains the intuition of non-probabilistic low-dimensional manifold learning. G-LBM selects the intrinsic dimensionality of the underlying manifold of the observations, and its probabilistic nature models the noise in the observation data. G-LBM has a direct application in background scene model estimation from video sequences, and we have evaluated its performance on the SBMnet-2016 and BMC2012 datasets, where it achieved performance higher than or comparable to other state-of-the-art methods while remaining agnostic to the scene. Moreover, under challenges such as camera jitter and background motion, G-LBM robustly estimates the background by effectively modeling the uncertainties in the video observations. (The code and models are available at: https://github.com/brezaei/G-LBM.)

Keywords: Background estimation · Foreground segmentation · Non-linear manifold learning · Deep neural network · Variational auto-encoding
Electronic supplementary material The online version of this chapter (https://doi.org/10.1007/978-3-030-58610-2_18) contains supplementary material, which is available to authorized users.

© Springer Nature Switzerland AG 2020
A. Vedaldi et al. (Eds.): ECCV 2020, LNCS 12357, pp. 293–310, 2020. https://doi.org/10.1007/978-3-030-58610-2_18
1 Introduction
Many high-dimensional real-world datasets consist of data points coming from a lower-dimensional manifold corrupted by noise and possibly outliers. In particular, the background in videos recorded by a static camera may be generated from a small number of latent processes that all non-linearly affect the recorded video scenes. Linear multivariate analysis methods such as robust principal component analysis (RPCA) and its variants have long been used to estimate such underlying processes in the presence of noise and/or outliers in measurements with large data matrices [6,17,41]. However, these linear methods may fail to find the low-dimensional structure of the data when the mapping of the data into the latent space is non-linear. For instance, background scenes in real-world videos lie on one or more non-linear manifolds; an investigation of this fact is presented in [16]. Therefore, a robust representation of the data should capture the underlying non-linear structure of the real-world data as well as its uncertainties. To this end, we propose a generative low-dimensional background model (G-LBM).