Segmentation mask-guided person image generation

Meichen Liu 1 & Xin Yan 1 & Chenhui Wang 2 & Kejun Wang 1

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Background clutter and pose variation are key factors that prevent a network from learning a robust person re-identification (Re-ID) model. To address this problem, we first introduce a binary segmentation mask to construct the body region that serves as the input of the generator, and then design a segmentation mask-guided person image generation network for pose transfer. The binary segmentation mask removes background clutter at the pixel level and carries more detailed edge information, so the generated image achieves better shape consistency with the input image. Compared with previous methods, the proposed method can dramatically improve the model's adaptive ability and deal with the diversity of postures. In addition, we design a lightweight attention mechanism module as a guider module, which helps the generator focus on the discriminative features of pedestrians. Experimental results demonstrate the effectiveness of the proposed method and its superior performance over most state-of-the-art methods, without excessive computation in the design of the Re-ID model. It is worth mentioning that our ideas can be easily combined with other fields to address the insufficient pose variation in current datasets.

Keywords Pose transferrable · Segmentation mask · Generative adversarial networks · Person re-identification
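As a minimal sketch of the masking step described in the abstract, the body region could be obtained by multiplying the image elementwise with the binary mask. This assumes the image is an H × W × C array and the mask is an H × W array of 0s and 1s; the function name `extract_body_region` and the toy values are illustrative and are not taken from the paper.

```python
import numpy as np


def extract_body_region(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out background pixels of an H x W x C image using an H x W binary mask."""
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image's height and width")
    # Broadcast the mask over the channel axis: 1 keeps a pixel, 0 removes it.
    return image * mask[..., np.newaxis].astype(image.dtype)


# Toy 2 x 2 RGB image (hypothetical values); the mask marks the left column as body.
image = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)
mask = np.array([[1, 0], [1, 0]], dtype=np.uint8)
body_region = extract_body_region(image, mask)
```

In this sketch the masked array, rather than the raw image, would be what is passed to the generator, so background pixels contribute nothing to the input.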

* Chenhui Wang [email protected]
* Kejun Wang [email protected]
Meichen Liu [email protected]
Xin Yan [email protected]

1 College of Automation, Harbin Engineering University, Harbin 150001, China
2 Department of Statistics, University of California, Los Angeles 90095-1554, USA

1 Introduction

Person re-identification (Re-ID) aims to match a person across non-overlapping video cameras. This task has attracted considerable attention for its application in automatic video surveillance [1–4]. Variations in pose, viewpoint, illumination, and occlusion make Re-ID a challenging task. Early studies of the Re-ID task focus either on discriminative hand-crafted features or on robust distance metrics for similarity measurement [5–7]. With the development of deep learning [8–10], Re-ID has made great progress in recent years. Most methods employ deep neural networks (DNNs) to learn discriminative features or design appropriate loss functions [11–14]. However, since existing benchmarks such as Market-1501 [15], DukeMTMC-reID [16] and CUHK03 [17] contain a limited number of pose changes, pose variation becomes one of the key factors that prevent the network from learning a robust Re-ID model. Therefore, in this paper, we focus on transferring a person from one pose to another. Most previous pose transfer approaches adopt a keypoint-based pose representation, such as [11, 18–21, 22, 23]