Fast and robust key frame extraction method for gesture video based on high-level feature representation



ORIGINAL PAPER

Huimin Yang1 · Qiuhong Tian1 · Qiaoli Zhuang1 · Linye Li1 · Qinglong Liang1

Received: 15 March 2020 / Revised: 12 August 2020 / Accepted: 11 September 2020 © Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract

In gesture video, inter-frame differences are too subtle to be captured by low-level features, and the gesture frames that carry semantic information make up only a small fraction of the whole video. This paper introduces a fast and robust key frame extraction method for gesture video, founded upon high-level feature representation, which extracts gesture key frames precisely without losing semantic information. First, a gesture video segmentation model is designed by employing SSD, which classifies gesture video into semantic scenes and static scenes. Then, a 2D-DWT-based perceptual hash algorithm is studied to extract candidate static key frames. Afterward, the multi-channel histogram of gradient magnitude frequency (HGMF-MC), based on an improved VGG16, is developed as a new image descriptor. Finally, a key frame extraction mechanism based on HGMF-MC is proposed to generate gesture video summaries for the two scene types, respectively. Experiments consistently show the superiority of the proposed method on the Chinese sign language, Cambridge, ChaLearn and CVRR-Hands gesture datasets. The results demonstrate that the proposed method is effective: it improves the video compression ratio and outperforms state-of-the-art methods.

Keywords Gesture video classification · Improved VGG16 · Histogram of gradient magnitude frequency · 2D-DWT-based perceptual hash
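The paper does not reproduce the hashing step here, but the idea of a DWT-based perceptual hash can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: it uses a hand-rolled one-level Haar averaging in place of a full 2D-DWT, and thresholds the low-frequency subband at its median to produce hash bits, so near-duplicate frames yield hashes with a small Hamming distance.

```python
import numpy as np

def haar_ll(img):
    """One Haar-style low-pass level: average adjacent rows, then columns.
    This keeps only the LL (approximation) subband of a 2D-DWT."""
    rows = (img[0::2, :] + img[1::2, :]) / 2.0
    return (rows[:, 0::2] + rows[:, 1::2]) / 2.0

def dwt_phash(img, levels=2):
    """Perceptual hash: bit is 1 where the low-frequency subband exceeds
    its own median, 0 elsewhere (robust to global brightness shifts)."""
    ll = img.astype(np.float64)
    for _ in range(levels):
        ll = haar_ll(ll)
    return (ll > np.median(ll)).flatten()

def hamming(h1, h2):
    """Number of differing hash bits between two frames."""
    return int(np.count_nonzero(h1 != h2))
```

Frames whose Hamming distance falls below a tuned threshold would be treated as duplicates of the same candidate static key frame.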

1 Introduction

Gesture recognition (GR) plays a dominant role in human–computer interaction, and dynamic gesture recognition better meets its real-time needs. A dynamic gesture video is often converted into hundreds of video frames, and analyzing such an amount of data is a complex, time-consuming task. Key frame extraction decreases the amount of data to process and can thus improve the real-time performance of a gesture recognition algorithm; it is therefore an effective way to generate a video summary. The extracted key frames should represent the sequence information of the whole gesture video without missing important content, and at the same time they should not be similar to one another.
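The two requirements just stated (full coverage, low redundancy) can be illustrated with a minimal greedy selection sketch. This is a hypothetical baseline, not the method proposed in the paper: a frame is kept as a key frame only when its mean absolute difference from the last kept frame exceeds a threshold.

```python
import numpy as np

def extract_key_frames(frames, threshold=0.1):
    """Greedy key frame selection over a list of equally-shaped float
    arrays in [0, 1]. Keeps frame i when its mean absolute difference
    from the most recently kept key frame exceeds `threshold`."""
    if len(frames) == 0:
        return []
    keys = [0]  # always keep the first frame for coverage
    for i in range(1, len(frames)):
        diff = float(np.mean(np.abs(frames[i] - frames[keys[-1]])))
        if diff > threshold:  # sufficiently dissimilar -> new key frame
            keys.append(i)
    return keys
```

Such low-level pixel differencing is exactly what fails on gesture video, where inter-frame changes are subtle, which motivates the high-level feature representation developed in this paper.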

Corresponding author: Qiuhong Tian ([email protected])
Huimin Yang ([email protected])

1 Zhejiang Sci-Tech University, Hangzhou, China

At present, key frame extraction algorithms for video can be categorized into four classes: clustering based, motion information based, video segmentation based and deep learning based. Clustering is a widely used key frame extraction approach [1]: video frames are clustered according to similarity, and the frame closest to each cluster center is selected as the key