ORIGINAL ARTICLE

Modality-transfer generative adversarial network and dual-level unified latent representation for visible thermal person re-identification

Xing Fan¹ · Wei Jiang¹ · Hao Luo¹ · Weijie Mao¹

Accepted: 4 November 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract
Visible thermal person re-identification, also known as RGB-infrared person re-identification, is an emerging cross-modality search problem that identifies the same person across different modalities. Solving it requires knowing what a person looks like in each modality: images of the same person, captured at the same time from the same camera view in both modalities, would reveal the similarities and differences between the modalities. However, existing datasets do not fully satisfy these requirements. Thus, a modality-transfer generative adversarial network is proposed to generate, for a source image, a cross-modality counterpart in the target modality, yielding paired images of the same person. Because query images come from one modality and gallery images from another, a unified representation of both modalities is needed so that cross-modality matching can be performed. In this study, a novel dual-level unified latent representation is proposed for the visible thermal person re-identification task, comprising an image-level patch fusion strategy and a feature-level hierarchical granularity triplet loss, which together produce a more general and robust unified feature embedding. Extensive experiments on both the SYSU-MM01 dataset (visible and near-infrared images) and the RegDB dataset (visible and far-infrared images) demonstrate the efficiency and generality of the proposed method, which achieves state-of-the-art performance. The code will be publicly released.

Keywords Visible thermal person re-identification · Cross-modality · Modality-transfer · Feature embedding
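The cross-modality matching step described above can be illustrated with a minimal sketch: once query (e.g., thermal) and gallery (e.g., visible) images are mapped into a shared unified embedding space, retrieval reduces to ranking gallery features by similarity to each query feature. The code below is a generic illustration of this retrieval stage only, not the paper's actual network; the function names and the use of cosine similarity are assumptions for the example.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """Project feature vectors onto the unit hypersphere."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def cross_modality_rank(query_feats, gallery_feats):
    """Rank gallery entries for each query by cosine similarity
    computed in the shared (unified) embedding space.

    query_feats:   (num_queries, dim) features from one modality
    gallery_feats: (num_gallery, dim) features from the other modality
    Returns an index matrix with the best-matching gallery entry first.
    """
    q = l2_normalize(np.asarray(query_feats, dtype=np.float64))
    g = l2_normalize(np.asarray(gallery_feats, dtype=np.float64))
    sim = q @ g.T                       # cosine similarity matrix
    return np.argsort(-sim, axis=1)     # descending similarity

# Toy example: 2 thermal query features, 3 visible gallery features.
queries = [[1.0, 0.0], [0.0, 1.0]]
gallery = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
ranks = cross_modality_rank(queries, gallery)
# ranks[i, 0] is the gallery index ranked first for query i.
```

Rank-1 accuracy, reported later for SYSU-MM01 and RegDB, is simply the fraction of queries whose top-ranked gallery entry shares the query's identity.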

1 Introduction

Person re-identification (ReID) aims to identify the same person in a camera network: given a query image, it retrieves all images with the same identity from other cameras. The traditional person re-identification task usually involves cameras of the same type, i.e., visible cameras producing RGB images. Due to illumination changes, occlusion, pose variation, background distraction, and camera-view differences, person ReID is a challenging task. Nevertheless, with the help of deep neural networks, which have demonstrated unrivaled superiority over hand-crafted features [22], the visible person ReID task has achieved a huge performance boost [13,29,32,39,43]; for example, the rank-1 accuracy

✉ Wei Jiang
[email protected]

1 The State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, China

reaches 95.6% [39] on the widely used Market-1501 [47] visible ReID dataset. However, visible images captured in low-light conditions, such as night scenes or dark rooms, are not informative. Like human eyes, a visible camera cannot capture much