Human Re-identification in Crowd Videos Using Personal, Social and Environmental Constraints

Abstract. This paper addresses the problem of human re-identification in videos of dense crowds. Re-identification in crowded scenes is challenging because of the large number of people and frequent occlusions, coupled with changes in appearance caused by the differing properties and exposure of the cameras. To solve this problem, we model multiple Personal, Social and Environmental (PSE) constraints on human motion across cameras in crowded scenes. The personal constraints include the appearance and preferred speed of each individual, while the social influences are modeled by grouping and collision avoidance. Finally, the environmental constraints model the transition probabilities between gates (entrances/exits). We incorporate these constraints into an energy minimization framework for solving human re-identification. Assigning 1–1 correspondence while modeling PSE constraints is NP-hard. We optimize using a greedy local neighborhood search algorithm that restricts the search space of hypotheses. We evaluated the proposed approach on several thousand frames of the PRID and Grand Central datasets, and obtained significantly better results than existing methods.

Keywords: Video surveillance · Re-identification · Dense crowds · Social constraints · Multiple cameras · Human tracking
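To illustrate the kind of optimization outlined above, the following Python sketch minimizes a toy matching energy by greedy local search. The energy has a linear cost per match, a pairwise cost per pair of matches, and a fixed penalty for leaving a person unmatched (people may leave one camera and never appear in the other). This is a minimal sketch under stated assumptions, not the method of this paper: the function names (`energy`, `neighbors`, `local_search`), the `miss_cost` penalty, and the move set (re-assign one person, or swap the matches of two people) are hypothetical choices made only for this example.

```python
from itertools import combinations


def energy(assign, linear, pair_cost, miss_cost):
    """Cost of a 1-1 assignment (hypothetical toy energy).

    assign[i] is the index of the camera-b candidate matched to person i
    from camera a, or None if person i is left unmatched.
    linear[i][j] is the cost of the single match i -> j.
    pair_cost(i, j, k, l) is the cost of selecting i -> j and k -> l together.
    """
    total = 0.0
    for i, j in enumerate(assign):
        total += miss_cost if j is None else linear[i][j]
    matched = [(i, j) for i, j in enumerate(assign) if j is not None]
    for (i, j), (k, l) in combinations(matched, 2):
        total += pair_cost(i, j, k, l)
    return total


def neighbors(assign, num_candidates):
    """Yield assignments one move away, always keeping the matching 1-1."""
    used = {j for j in assign if j is not None}
    free = [j for j in range(num_candidates) if j not in used]
    for i in range(len(assign)):
        # Re-assign person i to a free candidate, or leave person i unmatched.
        for j in [None] + free:
            if j != assign[i]:
                yield assign[:i] + [j] + assign[i + 1:]
    for i, k in combinations(range(len(assign)), 2):
        # Swap the matches of persons i and k.
        if assign[i] != assign[k]:
            swapped = list(assign)
            swapped[i], swapped[k] = swapped[k], swapped[i]
            yield swapped


def local_search(linear, pair_cost, miss_cost, num_candidates):
    """Accept the first improving move until no neighbor lowers the energy."""
    assign = [None] * len(linear)  # start with everyone unmatched
    best = energy(assign, linear, pair_cost, miss_cost)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(assign, num_candidates):
            cost = energy(cand, linear, pair_cost, miss_cost)
            if cost < best:
                assign, best, improved = cand, cost, True
                break
    return assign, best
```

A search of this kind only reaches a local minimum of the energy; the result depends on the starting assignment and on the order in which moves are tried.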

1 Introduction

Human re-identification is a fundamental and crucial problem for multi-camera surveillance systems [17,49]. It involves re-identifying individuals after they leave the field-of-view (FOV) of one camera and appear in the FOV of another camera (see Fig. 1(a)). The investigation of the Boston Marathon bombing highlights the importance of re-identification in crowded scenes. Authorities had to sift through a mountain of footage from government surveillance cameras, private security cameras and imagery shot by bystanders on smart phones [22]. Automatic re-identification in dense crowds would therefore enable successful monitoring and analysis of crowded events.

Dense crowds are the most challenging scenario for human re-identification. With a large number of people, appearance alone provides a weak cue. Often, people in crowds wear similar clothes, which makes re-identification even harder (Fig. 1(c)).

© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part II, LNCS 9906, pp. 119–136, 2016. DOI: 10.1007/978-3-319-46475-6_8

S. Modiri Assari et al.

Fig. 1. (a) Our goal is to re-identify people who leave camera a at time t (top row) when they reappear in camera b at some time t + 1, t + 2, ... in the future. The invisible region between the cameras is not closed, which means people can leave one camera and never appear in the other camera. (b) We construct a graph between individuals in the two cameras, as shown with black lines. Some of the constraints are linear in nature (appearance, speed, destination) while others are quadratic (spatial and social grouping, collision avoidance). The quadratic constraints are shown in red and capture relationships between matches. In (c), the people in black boxes are from camera