A novel approach for multi-cue feature fusion for robust object tracking


Ashish Kumar¹ · Gurjit Singh Walia² · Kapil Sharma¹

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Object tracking is a significant problem in computer vision owing to challenging environmental variations. A single-cue appearance model is not sufficient to handle these variations. To this end, we propose a multi-cue tracking framework in which complementary cues, namely LBP and HOG, are exploited to build a robust appearance model. The proposed feature fusion captures the high-level relationship between the features and diminishes the low-level relationship. Transductive reliability is also integrated at each frame to keep the tracker adaptive to the changing environment. In addition, a K-Means-based classifier draws a clear and concise boundary between positive and negative fragments, which are further used to update the reference dictionary. This adaptation strategy prevents erroneous updates of the proposed tracker during background clutter, occlusion, and fast motion. Qualitative and quantitative analyses on challenging video sequences from the OTB-100, VOT, and UAV123 datasets reveal that the proposed tracker performs favorably against 13 other state-of-the-art trackers.

Keywords: Object tracking · Feature fusion · Classifier · Reliability
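To make the two complementary cues concrete, the sketch below extracts HOG and LBP descriptors from a grayscale target patch using scikit-image and concatenates them. This is only an illustrative sketch, not the authors' implementation: the parameter choices (cell size, LBP radius, histogram bins) are assumptions, and the plain concatenation is a placeholder for the paper's fusion scheme, which instead models the high-level relationship between the cues.

```python
# Illustrative sketch (not the authors' code): extract the two complementary
# cues named in the abstract -- HOG and LBP -- from a target patch and fuse them.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def extract_cues(patch_gray):
    """Return a fused HOG + LBP descriptor for a grayscale target patch."""
    # Gradient-orientation cue: captures shape/edge structure.
    hog_vec = hog(patch_gray, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2), feature_vector=True)

    # Texture cue: histogram of uniform LBP codes over the patch.
    lbp = local_binary_pattern(patch_gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

    # Feature-level fusion placeholder: plain concatenation of the two cues.
    return np.concatenate([hog_vec, lbp_hist])

# Example: a 64x64 patch cropped around the target in the current frame.
patch = (np.random.rand(64, 64) * 255).astype(np.uint8)
descriptor = extract_cues(patch)
print(descriptor.shape)
```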

* Corresponding author: Kapil Sharma, [email protected]

1 Delhi Technological University, Delhi, India

2 SAG, Defense Research and Development Organization, New Delhi 110042, India

1 Introduction

Object tracking is an imperative field of computer vision with a wide range of applications in video surveillance, human-computer interaction, augmented reality, robotics, and motion analysis. Tracking consists of detecting the object in the first frame and predicting its state in each subsequent frame of a video sequence. In recent years, many methods have been proposed in this direction, but tracking remains a challenging problem due to dynamic environmental conditions that include pose variations, scale variations, illumination variations, full or partial occlusion, and background clutter. Under such dynamic variations, a single cue is not sufficient to handle target appearance changes; hence, multiple cues are necessary for building a robust appearance model.

In object tracking, the appearance model can be categorized into generative and discriminative models. Trackers such as MTT [1], ASLA [2], and FRAG [3] fall under the generative model; they track the target by searching for the most similar region. On the other hand, the discriminative model treats tracking as a binary classification problem and separates the foreground from the background by training a classifier. WMIL [4], the discriminative reverse sparse tracker [5], and CT [6] are categorized as discriminative trackers. Generally, trackers under these methods combine the extracted features either at score level or at feature level. Score level fusion combines the classifier scores for different features an