On the Exploration of Joint Attribute Learning for Person Re-identification


Abstract. This paper presents an algorithm for jointly learning a set of mid-level attributes from an image ensemble by locating clusters of dependent attributes. Human-describable attributes are an active research topic because they transfer between domains, are understandable to humans, and improve identification performance. Joint learning may enhance attribute classification when there is inherent dependency among the attributes. We propose an agglomerative clustering scheme to determine which sets of attributes should be learned jointly in order to maximize the margin of performance improvement. We evaluate the joint learning algorithm on a set of attributes for the task of person re-identification. We find that the proposed algorithm can improve classifier accuracy over both independent and fully joint attribute classification. Furthermore, the enhanced classifiers also improve performance on the person re-identification task. Our algorithm is potentially applicable to a wide variety of attribute-based visual recognition problems.
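To make the idea of agglomerative attribute clustering concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in this paper. It starts with every attribute in its own cluster and greedily merges the pair of clusters with the largest estimated benefit from joint learning, stopping when no merge is expected to help. The function `toy_gain`, the table `TOY_DEPENDENCY`, its numeric values, and the `threshold` parameter are hypothetical placeholders chosen for illustration; in practice the gain would come from measured classifier performance.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List

Cluster = FrozenSet[str]

# Hypothetical pairwise dependency scores (illustrative values only,
# not results from the paper). Unlisted pairs are treated as independent.
TOY_DEPENDENCY = {
    frozenset({"bald", "male"}): 0.6,
    frozenset({"darkhair", "bald"}): 0.5,
    frozenset({"redshirt", "greenshirt"}): 0.7,
}


def toy_gain(a: Cluster, b: Cluster, threshold: float = 0.3) -> float:
    """Assumed stand-in for the benefit of learning clusters a and b jointly.

    Uses the mean pairwise dependency between the two clusters minus a
    threshold, so weakly dependent clusters yield a negative gain.
    """
    scores = [TOY_DEPENDENCY.get(frozenset({x, y}), 0.0) for x in a for y in b]
    return sum(scores) / len(scores) - threshold


def agglomerate(
    attributes: Iterable[str],
    gain: Callable[[Cluster, Cluster], float],
) -> List[Cluster]:
    """Greedily merge attribute clusters while some merge has positive gain."""
    clusters: List[Cluster] = [frozenset([a]) for a in attributes]
    while len(clusters) > 1:
        # Pick the pair of clusters whose merge has the highest estimated gain.
        a, b = max(combinations(clusters, 2), key=lambda pair: gain(*pair))
        if gain(a, b) <= 0:
            break  # no remaining merge is expected to improve performance
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


ATTRIBUTES = ["bald", "male", "darkhair", "redshirt", "greenshirt", "hasbackpack"]
for cluster in agglomerate(ATTRIBUTES, toy_gain):
    print(sorted(cluster))
```

With these toy scores, the shirt-color attributes end up in one cluster and `bald` and `male` in another, while `darkhair` and `hasbackpack` remain singletons that would be learned independently. The stopping rule is what lets the result fall between fully independent and fully joint learning.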

1 Introduction

Person re-identification seeks to locate the same individual across multiple non-overlapping cameras within a short time frame [1]. As an enabling technique for video surveillance [2,3], it has many applications, such as tracker linking, person retrieval, and searching for missing children in public spaces. Depending on the application, person re-identification can be posed in different scenarios. For example, classic person re-identification is image-to-image matching, where one image is the occurrence of the person of interest in one of the cameras. Zero-shot identification is description-to-image matching, where the only prior knowledge is a verbal description by an eyewitness. While much prior work on person re-identification relies on image matching based on low-level visual features [4–8], human-describable, mid-level attributes have recently become a promising approach for both the re-identification [9] and zero-shot identification [10] scenarios. This is especially true for the latter, where describable attributes are the only source of input information.

© Springer International Publishing Switzerland 2015. D. Cremers et al. (Eds.): ACCV 2014, Part I, LNCS 9003, pp. 673–688, 2015. DOI: 10.1007/978-3-319-16865-4_44

J. Roth and X. Liu

[Figure 1: "Algorithm" and "Application" panels. Attributes shown: bald, darkhair, hasbackpack, redshirt, darkshirt, greenshirt, lightshirt, male. Clusters labeled G1, G2, G3.]

Fig. 1. Given an image ensemble with labels on a set of attributes, our algorithm automatically partitions the attribute set into various clusters and jointly learns a classifier for multiple attributes within each cluster. This leads to superior performance in both attribute classification and person re-identification applications (e.g., zero-shot identification).

These attributes have a number of advantages over low-level visual features. First, they make it possible for a human in the loop to assist decision making. Second, they can improve the system performance by fusion with low-level