Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking



Abstract. Discriminative Correlation Filters (DCF) have demonstrated excellent performance for visual object tracking. The key to their success is the ability to efficiently exploit available negative data by including all shifted versions of a training sample. However, the underlying DCF formulation is restricted to single-resolution feature maps, significantly limiting its potential. In this paper, we go beyond the conventional DCF framework and introduce a novel formulation for training continuous convolution filters. We employ an implicit interpolation model to pose the learning problem in the continuous spatial domain. Our proposed formulation enables efficient integration of multi-resolution deep feature maps, leading to superior results on three object tracking benchmarks: OTB-2015 (+5.1 % in mean OP), Temple-Color (+4.6 % in mean OP), and VOT2015 (20 % relative reduction in failure rate). Additionally, our approach is capable of sub-pixel localization, crucial for the task of accurate feature point tracking. We also demonstrate the effectiveness of our learning formulation in extensive feature point tracking experiments.
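The abstract notes that a continuous confidence function enables sub-pixel localization. As a hedged illustration of the general idea (not the paper's implicit interpolation model), a discrete response peak can be refined to sub-pixel accuracy by fitting a quadratic to the maximum and its neighbours; the function name `subpixel_peak` is ours, not from the paper:

```python
import numpy as np

def subpixel_peak(response):
    """Refine the discrete argmax of a 2-D response map to sub-pixel
    accuracy via a separable quadratic fit around the peak.
    A generic sketch, not the paper's continuous formulation."""
    r, c = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape

    def offset(m1, m0, p1):
        # Vertex of the parabola through (-1, m1), (0, m0), (1, p1).
        denom = m1 - 2.0 * m0 + p1
        return 0.0 if denom >= 0 else 0.5 * (m1 - p1) / denom

    dy = offset(response[r - 1, c], response[r, c], response[r + 1, c]) \
        if 0 < r < h - 1 else 0.0
    dx = offset(response[r, c - 1], response[r, c], response[r, c + 1]) \
        if 0 < c < w - 1 else 0.0
    return r + dy, c + dx
```

For a response that is exactly quadratic near the peak, this recovers the true maximum; for smoother confidence functions it is a first-order approximation.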

1 Introduction

Visual tracking is the task of estimating the trajectory of a target in a video. It is one of the fundamental problems in computer vision. Tracking of objects or feature points has numerous applications in robotics, structure-from-motion, and visual surveillance. In recent years, Discriminative Correlation Filter (DCF) based approaches have shown outstanding results on object tracking benchmarks [30,46]. DCF methods train a correlation filter for the task of predicting the target classification scores. Unlike other methods, DCF approaches efficiently utilize all spatial shifts of the training samples by exploiting the discrete Fourier transform. Deep convolutional neural networks (CNNs) have shown impressive performance for many tasks, and are therefore of interest for DCF-based tracking. A CNN consists of several layers of convolution, normalization, and pooling operations. Recently, activations from the last convolutional layers have been successfully employed for image classification. Features from these deep convolutional

Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46454-1_29) contains supplementary material, which is available to authorized users.

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 472–488, 2016. DOI: 10.1007/978-3-319-46454-1_29
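The efficient use of all spatial shifts via the discrete Fourier transform can be illustrated with the classical single-channel DCF solution (MOSSE-style ridge regression, solved independently per frequency). This is a sketch of the conventional discrete formulation that the paper generalizes, with function names of our choosing:

```python
import numpy as np

def train_dcf(f, g, lam=1e-2):
    """Closed-form single-channel DCF: ridge regression over all
    cyclic shifts of patch f against desired response g, solved
    per Fourier frequency. Returns the conjugate filter spectrum
    H* = (G . conj(F)) / (F . conj(F) + lam)."""
    F = np.fft.fft2(f)
    G = np.fft.fft2(g)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H_conj, z):
    """Correlation response of the learned filter on a new patch z."""
    return np.real(np.fft.ifft2(np.fft.fft2(z) * H_conj))
```

Training and detection each cost only a few FFTs, which is the source of the DCF framework's efficiency; the paper's contribution is to lift this discrete formulation to a continuous spatial domain supporting multi-resolution features.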


Fig. 1. Visualization of our continuous convolution operator, applied to a multi-resolution deep feature map. The feature map (left) consists of the input RGB patch along with the first and last convolutional layer of a pre-trained deep network. The second column visualizes the continuous convolution filters learned by our framework. The resulting continuous convolution outputs for each layer (third column) are combined into the final continuous confidence function (right) of the target.
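The figure shows per-layer convolution outputs of different resolutions being combined into a single confidence function. A loosely analogous discrete sketch (our own construction, not the paper's interpolation operator) interpolates each response onto a common grid via Fourier-domain zero-padding, i.e. implicit periodic sinc interpolation, before summing:

```python
import numpy as np

def fourier_upsample(r, shape):
    """Upsample a response map to `shape` by zero-padding its
    centered spectrum (periodic sinc interpolation)."""
    R = np.fft.fftshift(np.fft.fft2(r))
    H, W = shape
    h, w = r.shape
    P = np.zeros((H, W), dtype=complex)
    y0, x0 = (H - h) // 2, (W - w) // 2
    P[y0:y0 + h, x0:x0 + w] = R  # embed spectrum in larger grid
    scale = (H * W) / (h * w)    # compensate FFT normalization
    return np.real(np.fft.ifft2(np.fft.ifftshift(P))) * scale

def fuse_responses(responses, shape):
    """Combine per-layer responses of different resolutions by
    interpolating each onto a common grid and summing."""
    return sum(fourier_upsample(r, shape) for r in responses)
```

In the paper's continuous formulation this fusion happens analytically in one domain, rather than by resampling discrete maps as done here.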