Target Response Adaptation for Correlation Filter Tracking
Abstract. Most correlation filter (CF) based trackers utilize the circulant structure of the training data to learn a linear filter that best regresses this data to a hand-crafted target response. These circularly shifted patches are only approximations to actual translations in the image, which become unreliable in many realistic tracking scenarios including fast motion, occlusion, etc. In these cases, the traditional use of a single centered Gaussian as the target response impedes tracker performance and can lead to unrecoverable drift. To circumvent this major drawback, we propose a generic framework that can adaptively change the target response from frame to frame, so that the tracker is less sensitive to the cases where circular shifts do not reliably approximate translations. To do that, we reformulate the underlying optimization to solve for both the filter and target response jointly, where the latter is regularized by measurements made using actual translations. This joint problem has a closed form solution and thus allows for multiple templates, kernels, and multi-dimensional features. Extensive experiments on the popular OTB100 benchmark show that our target adaptive framework can be combined with many CF trackers to realize significant overall performance improvement (ranging from 3%–13.5% in precision and 3.2%–13% in accuracy), especially in categories where this adaptation is necessary (e.g. fast motion, motion blur, etc.).
Keywords: Correlation filter tracking · Adaptive target design
1 Introduction
Visual object tracking is a classical problem in computer vision. It plays an important role in a plethora of applications, such as robotics, surveillance, and human-computer interaction, to name a few. Object tracking can be defined as the task of localizing an object of interest (e.g. by an upright bounding box) in every frame, starting from a given patch containing the object in the first frame. The problem is very challenging because the object could undergo a variety of
Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46466-4_25) contains supplementary material, which is available to authorized users.
© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part VI, LNCS 9910, pp. 419–433, 2016.
A. Bibi et al.
Fig. 1. Examples where circular shifts do not represent actual translations. Patches (a) and (b) of video Lemming show the object in two consecutive frames, where the target was partially occluded and the occluder is within the filter window. The circular shift corresponding to the actual translation of the object in the next frame is given in patch (c). Note that both the occluder and target are shifted. Circ(x, n) and Tran(x, n) denote n circular shifts and actual translations applied to the patch x, respectively. Similarly, we show patches (d) and (e) of video Coke from two consecutive frames, where fast motion and partial occlusion occur. The corresponding circular shift is give
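The difference between Circ(x, n) and Tran(x, n) described in the caption can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `circ` and `tran` only mirror the caption's notation, their signatures are assumptions made for illustration, the shift is horizontal only, and the "frame" is a toy array in which every pixel has a unique value.

```python
import numpy as np


def circ(x, n):
    """Apply n circular shifts along the horizontal axis (columns wrap around)."""
    return np.roll(x, n, axis=1)


def tran(frame, top, left, height, width, n):
    """Crop the patch seen after the content truly moves n pixels to the right.

    Moving the content right by n is the same as sliding the crop window
    left by n, so real pixels from outside the original window enter the patch.
    (Hypothetical helper; the window must stay inside the frame.)
    """
    new_left = left - n
    if new_left < 0 or new_left + width > frame.shape[1]:
        raise ValueError("translated window falls outside the frame")
    return frame[top:top + height, new_left:new_left + width]


# Toy frame with a unique value per pixel so mismatches are easy to spot.
frame = np.arange(100).reshape(10, 10)
top, left, height, width, n = 2, 3, 4, 4, 1

x = frame[top:top + height, left:left + width]
shifted = circ(x, n)
translated = tran(frame, top, left, height, width, n)

# Columns that stayed inside the window agree; the wrapped column does not.
interior_match = np.array_equal(shifted[:, n:], translated[:, n:])
boundary_match = np.array_equal(shifted[:, :n], translated[:, :n])
print(interior_match, boundary_match)  # True False
```

In this toy case the two patches agree everywhere except in the columns that wrap around, where the circular shift reuses pixels from the opposite edge of the patch while the true translation brings in new image content. The larger the shift, the larger the mismatched region becomes.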