General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues
MPI Informatik, Saarbrücken, Germany
{hrhodin,theobalt}@mpi-inf.mpg.de
Intel Visual Computing Institute, Saarbrücken, Germany
Abstract. Markerless motion capture algorithms require a 3D body with properly personalized skeleton dimension and/or body shape and appearance to successfully track a person. Unfortunately, many tracking methods consider model personalization a different problem and use manual or semi-automatic model initialization, which greatly reduces applicability. In this paper, we propose a fully automatic algorithm that jointly creates a rigged actor model commonly used for animation – skeleton, volumetric shape, appearance, and optionally a body surface – and estimates the actor’s motion from multi-view video input only. The approach is rigorously designed to work on footage of general outdoor scenes recorded with very few cameras and without background subtraction. Our method uses a new image formation model with analytic visibility and analytically differentiable alignment energy. For reconstruction, 3D body shape is approximated as a Gaussian density field. For pose and shape estimation, we minimize a new edge-based alignment energy inspired by volume ray casting in an absorbing medium. We further propose a new statistical human body model that represents the body surface, volumetric Gaussian density, and variability in skeleton shape. Given any multi-view sequence, our method jointly optimizes the pose and shape parameters of this model fully automatically in a spatiotemporal way.
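To make the "Gaussian density field" and "absorbing medium" wording concrete, the following is a minimal sketch rather than the paper's image formation model. It assumes isotropic Gaussians and plain Beer-Lambert absorption, under which the density integrated along a camera ray has a closed form in terms of the error function, so the transmittance (and hence visibility) is analytic and differentiable. The names `optical_depth`, `transmittance`, and `density`, and the `(center, sigma, magnitude)` blob layout, are hypothetical and chosen only for illustration.

```python
import math


def density(point, blobs):
    """Density at a 3D point: a sum of isotropic Gaussians (assumed form)."""
    total = 0.0
    for center, sigma, magnitude in blobs:
        dist_sq = sum((p - c) ** 2 for p, c in zip(point, center))
        total += magnitude * math.exp(-dist_sq / (2.0 * sigma ** 2))
    return total


def optical_depth(origin, direction, s, blobs):
    """Closed-form integral of the density along origin + t * direction, t in [0, s].

    `direction` must have unit length. Each Gaussian splits into a factor that
    depends on its perpendicular distance to the ray and a 1D Gaussian along
    the ray, whose integral is expressed with erf.
    """
    total = 0.0
    for center, sigma, magnitude in blobs:
        offset = [c - o for c, o in zip(center, origin)]
        # Ray parameter of the point closest to the Gaussian center.
        s_bar = sum(d * v for d, v in zip(direction, offset))
        # Squared perpendicular distance from the center to the ray.
        perp_sq = max(sum(v * v for v in offset) - s_bar * s_bar, 0.0)
        scale = sigma * math.sqrt(2.0)
        along = math.erf((s - s_bar) / scale) - math.erf(-s_bar / scale)
        total += (magnitude * math.exp(-perp_sq / (2.0 * sigma ** 2))
                  * sigma * math.sqrt(math.pi / 2.0) * along)
    return total


def transmittance(origin, direction, s, blobs):
    """Fraction of light that survives absorption up to ray parameter s."""
    return math.exp(-optical_depth(origin, direction, s, blobs))


# Two illustrative blobs given as (center, sigma, magnitude); values are made up.
blobs = [((0.0, 0.0, 5.0), 1.0, 0.8), ((0.5, 0.0, 7.0), 0.5, 1.2)]
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
for s in (0.0, 4.0, 6.0, 10.0):
    print(s, transmittance(origin, direction, s, blobs))
```

Because every term is a smooth function of the Gaussian centers and widths, such a visibility term can be differentiated with respect to pose and shape parameters without sampling the ray, which is the property the abstract highlights.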
1 Introduction
Markerless full-body motion capture techniques refrain from markers used in most commercial solutions, and promise to be an important enabling technique in computer animation and visual effects production, in sports and biomechanics research, and the growing fields of virtual and augmented reality. While early markerless methods were confined to indoor use in more controlled scenes and backgrounds recorded with eight or more cameras [1], recent methods succeed in general outdoor scenes with much fewer cameras [2,3].
Electronic supplementary material: The online version of this chapter (doi:10.1007/978-3-319-46454-1_31) contains supplementary material, which is available to authorized users.
© Springer International Publishing AG 2016. B. Leibe et al. (Eds.): ECCV 2016, Part V, LNCS 9909, pp. 509–526, 2016. DOI: 10.1007/978-3-319-46454-1_31
H. Rhodin et al.
Fig. 1. Method overview. Pose is estimated from detections in Stage I; actor shape and pose are refined through contour alignment in Stage II by space-time optimization. Outputs are the actor skeleton, attached density, mesh, and motion.
Before motion capture commences, the 3D body model for tracking needs to be personalized to the captured human. This includes personalization of the bone lengths, but often also of biomechanical shape and surface, including appearance. This essential initialization is, unfortunately, neglected by many methods and solved with an entirely different approach, or with specific and complex manual or semi-automatic