T-MAN: a neural ensemble approach for person re-identification using spatio-temporal information

Nirbhay Kumar Tagore1 · Pratik Chattopadhyay1 · Lipo Wang2

Received: 17 January 2020 / Revised: 10 July 2020 / Accepted: 21 July 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Person re-identification plays a central role in tracking and monitoring crowd movement in public places, and hence it serves as an important means of providing public security at video surveillance sites. The problem of person re-identification has received significant attention in the past few years, and with the introduction of deep learning, several interesting approaches have been developed. In this paper, we propose an ensemble model called Temporal Motion Aware Network (T-MAN) for jointly handling the visual context and the spatio-temporal information in the input video sequences. Our methodology makes use of the long-range motion context with recurrent information for establishing correspondences among multiple cameras. The proposed T-MAN approach first extracts explicit frame-level feature descriptors from a given video sequence by using three different sub-networks (FPAN, MPN, and LSTM), and then aggregates these models using an ensemble technique to perform re-identification. The method has been evaluated on three publicly available data sets, namely PRID-2011, iLIDS-VID, and MARS, on which re-identification accuracies of 83.0%, 73.5%, and 83.3%, respectively, have been obtained. Experimental results demonstrate the effectiveness of our approach and its superiority over the state-of-the-art techniques for video-based person re-identification.

Keywords Spatio-temporal information · Ensemble model · Person re-identification · Deep learning

Pratik Chattopadhyay: [email protected]
Nirbhay Kumar Tagore: [email protected]
Lipo Wang: [email protected]

1 Pattern Recognition Lab, Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India

2 School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 639798, Singapore

Multimedia Tools and Applications

1 Introduction

Person re-identification is the process of establishing a one-to-one correspondence among the images of individuals captured by non-overlapping cameras at different points of time. A basic re-identification system can be broadly divided into three phases: person detection, tracking, and final retrieval. The problem of person re-identification, which aims to identify an individual captured by one camera in the field of view of another camera positioned at a different place, has received growing attention [27, 58]. Computer vision-based person re-identification must be robust against variations in lighting, posture, perspective, etc. Also, the continuous recording of videos from a camera network results in a huge volume of data, manual monitoring of which is time-intensive and error-