Multi-person Pose Estimation with Local Joint-to-Person Associations


Abstract. Despite the recent success of neural networks for human pose estimation, current approaches are limited to pose estimation of a single person and cannot handle humans in groups or crowds. In this work, we propose a method that estimates the poses of multiple persons in an image in which a person can be occluded by another person or might be truncated. To this end, we consider multi-person pose estimation as a joint-to-person association problem. We construct a fully connected graph from a set of detected joint candidates in an image and resolve the joint-to-person association and outlier detection using integer linear programming. Since solving joint-to-person association jointly for all persons in an image is an NP-hard problem and even approximations are expensive, we solve the problem locally for each person. On the challenging MPII Human Pose Dataset for multiple persons, our approach achieves the accuracy of a state-of-the-art method, but it is 6,000 to 19,000 times faster.
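To make the idea of a local joint-to-person association concrete, the following is a minimal sketch for a single person, not the formulation used in this work. It assumes hypothetical unary costs per joint candidate, a hypothetical pairwise limb-length cost, and a hypothetical `miss_penalty` for leaving a joint unassigned (occluded or truncated). The objective and constraints of the actual integer linear program are not given in this excerpt, and the sketch replaces the ILP solver with brute-force enumeration, which is only feasible for a handful of candidates.

```python
from itertools import product
from math import dist


def associate_joints(candidates, pairwise, miss_penalty=0.5):
    """Pick at most one candidate per joint type for ONE person.

    candidates:   dict joint_type -> list of (candidate_id, unary_cost)
    pairwise:     function (id_a, id_b) -> cost of assigning both to the person
    miss_penalty: hypothetical cost of leaving a joint type unassigned
    Returns (assignment, cost); assignment maps joint_type -> id or None.
    """
    types = sorted(candidates)
    unary = {t: dict(candidates[t]) for t in types}
    # Every joint type may take one of its candidates or None (outlier/occluded).
    options = [[None] + [cid for cid, _ in candidates[t]] for t in types]

    best, best_cost = None, float("inf")
    for choice in product(*options):  # exhaustive search stands in for the ILP
        cost = sum(miss_penalty if c is None else unary[t][c]
                   for t, c in zip(types, choice))
        for i in range(len(types)):
            for j in range(i + 1, len(types)):
                if choice[i] is not None and choice[j] is not None:
                    cost += pairwise(choice[i], choice[j])
        if cost < best_cost:
            best, best_cost = dict(zip(types, choice)), cost
    return best, best_cost


# Toy data (assumed): two heads and two necks belonging to two nearby persons.
positions = {"h0": (0, 0), "h1": (10, 0), "n0": (0, 1), "n1": (10, 1)}
candidates = {
    "head": [("h0", 0.1), ("h1", 0.1)],
    "neck": [("n0", 0.1), ("n1", 0.2)],
}


def limb_cost(a, b, expected_length=1.0):
    # Penalise deviation from an assumed head-neck distance.
    return abs(dist(positions[a], positions[b]) - expected_length)


best, cost = associate_joints(candidates, limb_cost)
print(best)  # {'head': 'h0', 'neck': 'n0'}
```

Restricting the search to the candidates around one person keeps the number of combinations small. Enumerating all persons jointly would multiply the option lists together, which illustrates why the global problem becomes expensive.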

1 Introduction

Single-person pose estimation has made remarkable progress over the past few years. This is mainly due to the availability of deep learning based methods for detecting joints [1–5]. While earlier approaches in this direction [4,6,7] combine the body part detectors with tree-structured graphical models, more recent methods [1–3,8–10] demonstrate that spatial relations between joints can be learned directly by a neural network without the need for an additional graphical model. These approaches, however, assume that only a single person is visible in the image and that the location of the person is known a priori. Moreover, the number of parts is defined by the network, e.g., full body or upper body, and cannot be changed. For realistic scenarios such assumptions are too strong, and the methods cannot be applied to images that contain a number of overlapping and truncated persons. An example of such a scenario is shown in Fig. 1.

In comparison to single-person human pose estimation benchmarks, multi-person pose estimation introduces new challenges. The number of persons in an image is unknown and needs to be correctly estimated, the persons occlude each other and might be truncated, and the joints need to be associated with the correct person. The simplest approach to tackle this problem is to first use a person detector and then estimate the pose for each detection independently [11–13].

© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part II, LNCS 9914, pp. 627–642, 2016. DOI: 10.1007/978-3-319-48881-3_44


U. Iqbal and J. Gall

Fig. 1. Example image from the multi-person subset of the MPII Pose Dataset [16].

This, however, does not resolve the joint association problem of two persons next to each other or truncations. Other approaches estimate the pose of all detected persons jointly [14,15]. In [2], a person detector is not required. Instead, body part proposals are generated and connected in a large graph. The approach then solves the labeling problem, th
