

PATTERN RECOGNITION AND IMAGE PROCESSING

Computational Method for Recognizing Situations and Objects in the Frames of a Continuous Video Stream Using Deep Neural Networks for Access Control Systems

O. S. Amosov^a,*, S. G. Amosova^a, S. V. Zhiganov^b, Yu. S. Ivanov^b, and F. F. Pashchenko^a

^a Institute of Control Sciences, Russian Academy of Sciences, Moscow, Russia
^b Komsomolsk-na-Amure State University, Komsomolsk-na-Amure, Russia
*e-mail: [email protected]

Received January 13, 2019; revised May 1, 2020; accepted May 25, 2020

Abstract: A computational method for pattern recognition in a continuous video stream using deep neural networks for access control systems is proposed; the method is effective in terms of both performance and accuracy. The class of recognition problems solved by the method using a sequence of video stream frames is identified: the vehicle itself and the characters on its license plate (LP), faces of people, and abnormal situations. In contrast to known solutions, classification with subsequent reinforcement based on multiple frames of a video stream is used, together with an algorithm for the automatic annotation of images. The following are proposed: neural network architectures with independent recurrent layers for classifying video fragments, adapted to these problems; a dual network for face recognition; and a deep neural network for vehicle character recognition. New databases for neural network training are created. A schematic diagram of an intelligent access control system for ensuring the security of an enterprise is proposed; its distinctive feature is the use of a multirotor unmanned aerial vehicle with a computing unit. Field experiments are carried out, and the accuracy and performance of the computational method in solving each problem are assessed. Software modules in the Python language for solving the tasks of the intelligent access control system are developed.

DOI: 10.1134/S1064230720050020
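The abstract does not say how predictions from multiple frames are combined. As an illustration only, the sketch below assumes one simple possibility: confidence-weighted voting over per-frame classifier outputs. The function name `aggregate_frame_predictions` and the sample plate strings and confidences are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def aggregate_frame_predictions(frame_predictions):
    """Combine per-frame (label, confidence) pairs into one decision.

    Assumption (not from the paper): confidences for the same label are
    summed across frames, and the label with the largest total wins.
    Returns the winning label and its share of the total confidence.
    """
    if not frame_predictions:
        raise ValueError("at least one frame prediction is required")

    totals = defaultdict(float)
    for label, confidence in frame_predictions:
        totals[label] += confidence

    best_label = max(totals, key=totals.get)
    share = totals[best_label] / sum(totals.values())
    return best_label, share


# Hypothetical per-frame outputs of a license plate recognizer:
# one frame misreads a character, the other three agree.
frames = [
    ("A123BC", 0.62),
    ("A128BC", 0.41),
    ("A123BC", 0.77),
    ("A123BC", 0.70),
]

label, share = aggregate_frame_predictions(frames)
print(label, round(share, 3))
```

In this sketch a single misread frame is outvoted by the frames that agree, which is the general motivation for using several frames rather than one. Other aggregation rules, such as majority voting without confidences, would fit the same interface.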

INTRODUCTION

Computer vision systems (CVS) are increasingly being introduced in various domains. Patterns of physical and technical objects, as well as situations involving them, can be captured in video stream frames. The set of properties is different for each object or situation. Examples of properties for technical objects, such as vehicles, are the type, number, color, etc.; for physical objects, such as people, they are gender, age, etc. The properties of objects can be assessed from individual frames, even from a single frame. Situations differ significantly from objects because they are characterized by duration and by the relationships between dynamic objects. Therefore, a sequence of frames of a continuous video stream is needed to assess the properties of situations. The main task of computer vision algorithms is to search for patterns in the image, identify the key features that characterize the properties of objects and situations, and recognize them for further decision making or control. Recently, the use of deep neural networks (NNs) in CVS recognition problems has become one of the most nota
