3D-2D deep convolutional neural network (DCNN) Cascade for robust video face identification

PDF / 2,178,600 Bytes
14 Pages / 439.37 x 666.142 pts Page_size
67 Downloads / 228 Views

3D-2D deep convolutional neural network (DCNN) Cascade for robust video face identification Kyeong Tae Kim 1 & Bumshik Lee 2 & Jae Young Choi 1 Received: 16 February 2020 / Revised: 19 July 2020 / Accepted: 29 July 2020 # Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

This paper proposes a novel video face identification method, named “3D-2D-DCNN cascade” that serially combines 3D and 2D deep convolutional neural networks (DCNNs) for robust video face recognition (FR). In our method, an input video (face) sequence is first divided into a number of sub-video sequences and each of the sub-video sequences is then used as an input to the 3D-DCNN, aiming to obtain a set of class-confidence scores for a given input video sequence. These class-confidence scores are aggregated in a novel way, resulting in the formation of our novel class-confidence matrix. Key characteristic of our method is to make use of this class-confidence matrix for fine-tuning 2D-DCNN, which is serially linked to 3D-DCNN, to obtain the final face identification results. To verify the proposed method, two popular video identification benchmarks, COX Face and YTC databases, were used. Compared to the best reported recognition results on these two benchmarks, our proposed method achieves better or comparable recognition performances. Keywords Video face identification . Deep convolutional neural network . 3D-2D-DCNN cascade . Class-confidence matrix

1 Introduction Video face identification has received a significant interest, due to a wide range of applications such as video surveillance, biometric identification, and content-based video indexing/search [28]. Recent trends in video face identification is the use of deep learning based methods, and

* Jae Young Choi [email protected]

1

Pattern Recognition and Machine Intelligence Laboratory, Division of Computer & Electronic Systems Engineering, Hankuk University of Foreign Studies, 81, Oedae-ro, Mohyeon-myeon, Cheoin-gu, Yongin-si, Gyeonggi-do 17305, Republic of Korea

2

Department of Information and Communications Engineering, Chosun University, 61452 Gwangju, Republic of Korea

Multimedia Tools and Applications

especially deep convolutional neural networks (DCNNs) show promising results [4, 24, 25, 31]. There have been a few studies on deep learning based video face identification. Here, several representative studies using deep learning on face identification are introduced. The authors in [17] presented a framework for detecting the image operator chain based on the Convolutional Neural Network (CNN) to determine whether an image has undergone a specific manipulation, and the sequence of manipulation. Authors in [31] proposed a Neural Aggregation Network (NAN) that generates a “fused” feature vector by combining multiple deep feature vectors, each computed for a particular video frame, using adaptive weights. However, if the weight of the high quality face image is too high, the effect of low quality face image is significantly suppressed, which leads to distort

Data Loading...

3D-2D deep convolutional neural network (DCNN) Cascade for robust video face identification

Recommend Documents

DCNN-IDS: Deep Convolutional Neural Network Based Intrusion Detection System

Robust Speaking Face Identification for Video Analysis

Deep Convolutional Neural Network for Real and Fake Face Discrimination

Dual adaptive deep convolutional neural network for video forgery detection in 3D lighting environment

A robust video zero-watermarking based on deep convolutional neural network and self-organizing map in polar complex exp

A Hybrid Convolutional Neural Network for Complex Leaves Identification

Gated Siamese Convolutional Neural Network Architecture for Human Re-identification

Deep Convolutional Neural Network for Microseismic Signal Detection and Classification

Deep Convolutional Neural Network for Remote Sensing Scene Classification

Deep Convolutional Neural Network Architectures for Tonal Frequency Identification in a Lofargram

AestheticNet: deep convolutional neural network for person identification from visual aesthetic

Fast and Robust Compression of Deep Convolutional Neural Networks