A new 3D convolutional neural network (3D-CNN) framework for multimedia event detection
ORIGINAL PAPER
Kaavya Kanagaraj · G. G. Lakshmi Priya

Received: 2 August 2019 / Revised: 29 September 2020 / Accepted: 1 October 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract
Multimedia event detection has received a great deal of interest due to developments in video technology and the growth of multimedia data. However, complexities of video content, such as noise, overlapping and repeated interactions between individuals, and varied scenes, make it difficult to characterize subjects and concepts; in particular, Internet users find it hard to search for a specific event. To address this problem, a method suited to event detection is proposed: a 3D convolutional neural network (3D-CNN) structure that achieves promising performance in multimedia event classification. To exploit the motion content of an event in the video, the temporal axis is considered. Both feature extraction and classification are incorporated in this model. Experiments are carried out on the Columbia Consumer Video benchmark dataset, and results are compared with existing works.

Keywords Multimedia event detection · 3D convolutional neural network · Feature extraction · Classification · Mean average precision · Columbia consumer video
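To give a rough intuition for how the temporal axis enters the computation, the pure-Python sketch below applies a 3D convolution whose kernel spans two consecutive frames. This is a didactic illustration only, not the paper's actual network: the clip, kernel values, and shapes are arbitrary assumptions.

```python
# Didactic sketch: a 3D convolution over a video clip indexed [t][y][x].
# A 2D CNN applied frame-by-frame sees each frame in isolation; a 3D kernel
# that spans the temporal axis can respond to change BETWEEN frames, i.e.,
# motion. NOT the authors' architecture; shapes/values are illustrative.

def conv3d(clip, kernel):
    """Valid 3D convolution (cross-correlation, as is conventional in CNNs)."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        frame = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                s = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            s += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(s)
            frame.append(row)
        out.append(frame)
    return out

# A temporal-difference kernel of shape (2, 1, 1): subtracts each pixel in
# the next frame from the same pixel in the current frame.
motion_kernel = [[[1.0]], [[-1.0]]]

# Two identical frames (no motion) followed by a frame where one pixel changes.
clip = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[5.0, 0.0], [0.0, 0.0]],
]
responses = conv3d(clip, motion_kernel)
# responses[0] is all zeros (static pair of frames), while responses[1][0][0]
# is -5.0, picking up the change between frames 2 and 3 that any purely
# per-frame 2D convolution would miss.
```

A real 3D-CNN stacks many such learned kernels (with nonlinearities and pooling) so that motion patterns, not just appearance, contribute to the event representation.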
1 Introduction

With the proliferation of video content, advanced technology for indexing, filtering, searching, and mining the enormous volume of videos is increasingly needed. On social platforms such as YouTube and Facebook, hundreds of thousands of user-generated videos are uploaded every week. User-generated video shot on cell phones, tablets, iPods, etc. has lower resolution than professionally filmed video and may suffer from jerkiness, partial occlusion, dense interaction between people, noisy environmental conditions, and so on. As a result, fewer reliable visual features are present, and semantic meaning is harder to recover; this makes identifying a specific event of interest in such videos challenging. An overall need therefore remains for technology that automatically identifies video content. This motivates us to focus on the multimedia event detection task, in which a specific event of interest is retrieved from videos. Although multimedia events are complex and may include low-level components such as human interaction, different scenes, ideas, and actions, they can involve major variations
in the intra-class. For example, events such as 'dog' and 'wedding reception' involve concepts like dog, wedding couple, ship/boat, dining, etc.; hence, it is difficult to interpret an individual event completely using a single concept. Moreover, videos are often captured in real-world scenarios, last from a few seconds to a few minutes, and are recorded under noisy environmental conditions. Because of these challenges, events are difficult to detect. The proposed work, i.e., a 3D convolutional neural network architecture