Unsupervised Deep Domain Adaptation for Pedestrian Detection

This paper addresses the problem of unsupervised domain adaptation on the task of pedestrian detection in crowded scenes. First, we utilize an iterative algorithm to iteratively select and auto-annotate positive pedestrian samples with high confidence as

  • PDF / 2,524,921 Bytes
  • 16 Pages / 439.37 x 666.142 pts Page_size
  • 62 Downloads / 290 Views

DOWNLOAD

REPORT


2

Shanghai Jiao Tong University, Shanghai, China [email protected] ITC-EOS, University of Twente, Enschede, The Netherlands

Abstract. This paper addresses the problem of unsupervised domain adaptation on the task of pedestrian detection in crowded scenes. First, we utilize an iterative algorithm to iteratively select and auto-annotate positive pedestrian samples with high confidence as the training samples for the target domain. Meanwhile, we also reuse negative samples from the source domain to compensate for the imbalance between the amount of positive samples and negative samples. Second, based on the deep network we also design an unsupervised regularizer to mitigate influence from data noise. More specifically, we transform the last fully connected layer into two sub-layers — an element-wise multiply layer and a sum layer, and add the unsupervised regularizer to further improve the domain adaptation accuracy. In experiments for pedestrian detection, the proposed method boosts the recall value by nearly 30 % while the precision stays almost the same. Furthermore, we perform our method on standard domain adaptation benchmarks on both supervised and unsupervised settings and also achieve state-of-the-art results. Keywords: Unsupervised domain adaptation izer · Deep neural network · People detection

1

·

Unsupervised regular-

Introduction

Deep neural networks have shown great power on traditional computer vision tasks, however, the labelled dataset should be large enough to train a reliable deep model. The annotation process for the task of pedestrian detection in crowded scenes is even more resource consuming, because we need to label concrete locations of pedestrian instances. In modern society, there are over millions of cameras deployed for surveillance. However, these surveillance situations vary in lights, background, viewpoints, camera resolutions and so on. Directly utilizing models trained on old scenes will result in poor performance on new situations due to data distribution changes. It is also unpractical to annotate pedestrian instances for every surveillance situation. When there are few or no labelled data in the target domain, domain adaptation helps to reduce the amount of labelled data needed. Basically, unsupervised domain adaptation aims to shift the model trained from the source domain to c Springer International Publishing Switzerland 2016  G. Hua and H. J´ egou (Eds.): ECCV 2016 Workshops, Part II, LNCS 9914, pp. 676–691, 2016. DOI: 10.1007/978-3-319-48881-3 48

Unsupervised Deep Domain Adaptation for Pedestrian Detection

677

the target domain for which only unlabelled data are provided. Most traditional works [1–5] either learn a shared representation between the source and target domain, or project features into a common subspace. Recently, there are also works [6–8] proposed to learn a scene-specific detector by deep architectures. However, heuristic methods are needed either for constructing feature space or re-weighting samples. Our motivation of developing a domain adaptation arc