Automatic Calibration of Stationary Surveillance Cameras in the Wild



1 ViNotion B.V., Eindhoven, The Netherlands
  {guido.brouwers,rob.wijnhoven}@vinotion.nl
2 Eindhoven University of Technology, Eindhoven, The Netherlands
  {m.zwemer,p.h.n.de.With}@tue.nl

Abstract. We present a fully automatic camera calibration algorithm for monocular stationary surveillance cameras. We exploit only information from pedestrian tracks and generate a full camera calibration matrix based on vanishing-point geometry. This paper presents the first combination of several existing components of calibration systems from the literature. The algorithm introduces novel pre- and post-processing stages that improve the estimation of the horizon line and the vertical vanishing point. The scale factor is determined using an average body height, enabling the extraction of metric information without manual measurements in the scene. Instead of evaluating performance on a limited number of camera configurations (video sequences) as in the literature, we have performed extensive simulations of the calibration algorithm for a large range of camera configurations. The simulations reveal that metric information can be extracted with an average error of 1.95% and that the derived focal length is more accurate than in the systems reported in the literature. Calibration experiments on real-world surveillance datasets, in which no restrictions are placed on pedestrian movement and position, show that the performance is comparable to the simulations (maximum error 3.7%), thereby confirming the feasibility of the system.

Keywords: Automatic camera calibration · Vanishing points

1 Introduction

(G.M.Y.E. Brouwers et al., in: G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part II, LNCS 9914, pp. 743–759. © Springer International Publishing Switzerland 2016. DOI: 10.1007/978-3-319-48881-3_52)

The growing number of video cameras for surveillance and security calls for more automatic analysis, based on detection and tracking of the moving objects in the scene. To obtain a global understanding of the environment, individual detection results from multiple cameras can be combined. For a more accurate global understanding, the pixel-based positions of objects detected in the individual cameras must be converted to a global coordinate system (e.g. GPS). To this end, each individual camera needs to be calibrated as a first and crucial step. The most common model relating pixel positions to real-world coordinates is the pinhole camera model [5]. In this model, the camera is assumed to perform a perfect perspective transformation (a matrix), which is described by the intrinsic and extrinsic parameters of the camera. The intrinsic parameters are pixel skew, principal point location, focal length and pixel aspect ratio. The extrinsic parameters describe the orientation and position of the camera with respect to a world coordinate system by a rotation and a translation. The process of finding the model parameters that best describe the mapping of the scene onto the image plane of the camera is called camera calibration. The golden standard for camera calibration [5] uses a pr
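To make the pinhole model concrete, the following minimal numpy sketch builds an intrinsic matrix from the parameters listed above (focal length, principal point, skew, aspect ratio) and an extrinsic rotation and translation, then projects a ground-plane point to pixels and back. All numeric values (focal length, camera height, tilt) are hypothetical placeholders; in the paper they are what the calibration algorithm estimates automatically from pedestrian tracks.

```python
import numpy as np

# --- Hypothetical parameters (illustrative only; the paper estimates
# --- the calibration automatically from pedestrian tracks).
f = 1000.0               # focal length in pixels
cx, cy = 960.0, 540.0    # principal point of a 1920x1080 image
skew, aspect = 0.0, 1.0  # pixel skew and pixel aspect ratio

# Intrinsic matrix K
K = np.array([[f,   skew,       cx],
              [0.0, aspect * f, cy],
              [0.0, 0.0,        1.0]])

# Extrinsics: camera 5 m above the ground, tilted 30 degrees below horizontal.
# World frame: X right, Y forward along the ground plane, Z up.
h = 5.0
tilt = np.deg2rad(30.0)
C = np.array([0.0, 0.0, h])                  # camera centre in world coords
R = np.array([[1.0,  0.0,           0.0],    # rows = camera axes in world frame
              [0.0, -np.sin(tilt), -np.cos(tilt)],
              [0.0,  np.cos(tilt), -np.sin(tilt)]])

def project(P_w):
    """World point (3,) -> pixel (u, v) via the pinhole model."""
    P_c = R @ (P_w - C)          # rotate/translate into camera coordinates
    u = K @ P_c                  # perspective projection
    return u[:2] / u[2]          # homogeneous -> pixel coordinates

def backproject_to_ground(uv):
    """Pixel (u, v) -> metric world point on the ground plane Z = 0."""
    d_c = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])  # viewing ray (camera)
    d_w = R.T @ d_c                                         # ray in world frame
    lam = -C[2] / d_w[2]         # intersect ray C + lam * d_w with Z = 0
    return C + lam * d_w

# A ground point on the optical axis projects to the principal point ...
P = np.array([0.0, h / np.tan(tilt), 0.0])
print(project(P))                        # -> approx [960, 540]
# ... and back-projecting that pixel recovers the metric ground position.
print(backproject_to_ground((cx, cy)))   # -> approx [0, 8.66, 0]
```

The `backproject_to_ground` helper illustrates why calibration is the crucial first step mentioned above: once K, R and the camera position are known, pixel-based detections can be turned into metric ground-plane coordinates and then mapped to a global system.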