Optimization of a Tracking System Based on a Network of Cameras

ARTIFICIAL INTELLIGENCE

V. V. Chigrinskii (a, *) and I. A. Matveev (b, **)

(a) Moscow Institute of Physics and Technology, Dolgoprudnyi, Moscow oblast, 141700 Russia
(b) Federal Research Center “Computer Science and Control,” Russian Academy of Sciences, Moscow, 119333 Russia
*e-mail: [email protected]
**e-mail: [email protected]

Received February 10, 2020; revised February 29, 2020; accepted March 30, 2020

Abstract: Tracking the motion of objects in video sequences is an important computer vision problem with a wide range of applications. The key steps in a tracking system are detecting an object and, if the object has been detected before, reidentifying it. A fast and correctly working tracking system that uses several cameras is described. The system includes detection and segmentation of objects in images, construction of their appearance descriptors, comparison of each new object with previously collected objects, and a decision about reidentification. A basic system configuration is implemented in which state-of-the-art detection algorithms and appearance-descriptor models serve as the constituent parts. Starting from this configuration, the system as a whole and some of its modules are modified. A computational experiment quantitatively confirms the advantages of the modified system over the basic one.

DOI: 10.1134/S1064230720040127

INTRODUCTION

First, we give some definitions and briefly describe how the system operates. By multicamera tracking we mean tracking an object using data obtained from multiple surveillance video cameras. The video streams produced by the cameras are processed frame by frame by a detector, a subsystem that determines the presence and location of objects of interest in the full images. The result of detecting an object (the set of its characteristics) is called a detection. When a new detection is found, the tracker, a system that follows the detected object until it disappears from the field of view, starts working. The time-ordered set of an object’s detections is called its track. After a track has been formed, the sequence of detections is transformed into a vector description of the object using a model (a neural network or a classical one). This vector description is called the object’s descriptor. The descriptors of processed objects are stored in a database, which we call the gallery, and the object currently under examination, once its descriptor has been constructed, is called a query. When a new query arrives, its descriptor is compared with all descriptors in the gallery, and based on this comparison the query is assigned a unique number called its identifier, or ID; the ID determines which class the query belongs to. For example, when tracking people, the identifier corresponds to an individual’s identity; for this reason, each unique object is called a person. The comparison of a query with the gallery and assignment of an ID to the