Multi-person Tracking by Multicut and Deep Matching


Abstract. In Tang et al. (2015), we proposed a graph-based formulation that links and clusters person hypotheses over time by solving a minimum cost subgraph multicut problem. In this paper, we modify and extend Tang et al. (2015) in three ways: (1) We introduce a novel local pairwise feature based on local appearance matching that is robust to partial occlusion and camera motion. (2) We perform extensive experiments to compare different pairwise potentials and to analyze the robustness of the tracking formulation. (3) We consider a plain multicut problem and remove outlying clusters from its solution. This allows us to employ an efficient primal feasible optimization algorithm that is not applicable to the subgraph multicut problem of Tang et al. (2015). Unlike the branch-and-cut algorithm used there, the efficient algorithm used here is applicable to long videos and many detections. Together with the novel pairwise feature, it eliminates the need for the intermediate tracklet representation of Tang et al. (2015). We demonstrate the effectiveness of our overall approach on the MOT16 benchmark (Milan et al. 2016), achieving state-of-the-art performance.

1 Introduction

Multi-person tracking is a problem studied intensively in computer vision. While continuous progress has been made, false positive detections, long-term occlusions and camera motion remain challenging, especially for people tracking in crowded scenes. Tracking-by-detection is commonly used for multi-person tracking: a state-of-the-art person detector is employed to generate detection hypotheses for a video sequence. In this case, tracking essentially reduces to an association task between detection hypotheses across video frames. This detection association task is often formulated as an optimization problem with respect to a graph: every detection is represented by a node; edges connect detections across time frames. The most commonly employed algorithms aim to find disjoint paths in such a graph [1–4]. The feasible solutions of such problems are sets of disjoint paths which do not branch or merge. While intuitive, such formulations cannot handle the multiple plausible detections per person that typical person detectors generate. Therefore, pre- and/or post-processing such as non-maximum suppression (NMS) on the detections and/or the final tracks is performed, which often requires careful fine-tuning of parameters.

© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part II, LNCS 9914, pp. 100–111, 2016. DOI: 10.1007/978-3-319-48881-3_8
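To make the graph formulation described in the introduction concrete, the following minimal Python sketch builds a detection graph and turns a set of joined edges into clusters. It is an illustration under stated assumptions, not the implementation used in the paper: the `Detection` class, the `max_frame_gap` parameter, the `allow_same_frame` switch and both function names are hypothetical. With `allow_same_frame=False` the graph only has edges across frames, as in disjoint-path formulations; with `allow_same_frame=True` detections in the same frame can also be linked, so that several hypotheses of one person can end up in one cluster.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one person hypothesis."""
    frame: int
    box: tuple  # (x, y, width, height); not used by the sketch itself


def build_detection_graph(detections, max_frame_gap=1, allow_same_frame=True):
    """Return edges (i, j), i < j, between detections close in time.

    Assumption: two detections are connected if their frame distance is
    at most `max_frame_gap`. Same-frame pairs are included only when
    `allow_same_frame` is True.
    """
    edges = []
    for i, j in combinations(range(len(detections)), 2):
        gap = abs(detections[i].frame - detections[j].frame)
        if gap == 0 and not allow_same_frame:
            continue
        if gap <= max_frame_gap:
            edges.append((i, j))
    return edges


def clusters_from_joined_edges(num_nodes, joined_edges):
    """Group nodes into connected components of the joined edges.

    Each component is interpreted as one person (one cluster of
    detections). Uses a small union-find with path halving.
    """
    parent = list(range(num_nodes))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in joined_edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u

    groups = {}
    for v in range(num_nodes):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())
```

The sketch leaves out edge costs and the optimization itself: in a clustering formulation a solver decides which edges are joined and which are cut, whereas here the joined edges are simply passed in by the caller.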


The minimum cost subgraph multicut problem proposed in [5] is an abstraction of the tracking problem that differs conceptually from disjoint path methods. It has two main advantages: (1) Instead of finding a path for each person in the graph, it links and clusters multiple plausible person hypotheses (detections) jointly over time and space. The feasible solutions of this formulation are components
