A Cascaded Approach for Keyframes Extraction from Videos




Abstract. Keyframe extraction, a fundamental problem in video processing and analysis, remains a challenge. In this paper, we introduce a novel method to effectively extract the keyframes of a video. It consists of four steps. First, we generate initial clips from the classified frames, such that the content within each clip is consistent. Second, guided by empirical evidence, we design an adaptive window length for frame difference processing, which then outputs the initial keyframes. Third, we remove frames carrying no meaningful information (e.g., a black screen) from the initial clips and initial keyframes. Finally, to achieve satisfactory keyframes, we map the current keyframes to the space of the current clips and optimize the keyframes based on similarity. Extensive experiments show that our method outperforms state-of-the-art keyframe extraction techniques, with averages of 96.84% in precision and 81.55% in F1.

Keywords: Keyframe extraction · Frame difference · Image classification · Video retrieval

1 Introduction

Keyframe extraction, that is, extracting keyframes from a video, is a fundamental problem in video processing and analysis. It has many applications, such as video coding, so it is important to design robust and effective keyframe extraction methods. Current methods are usually based on either the pixel matrix or deep learning classification results [5,6,9,10,18,19]. However, they still suffer from some limitations. More specifically, keyframe extraction techniques based on the pixel matrix are not capable of achieving decent accuracy, for example, when handling news videos [16]. Nevertheless, the keyframe extraction involved could take a considerable amount of time [15].

Y. Pei et al. © Springer Nature Switzerland AG 2020. F. Tian et al. (Eds.): CASA 2020, CCIS 1300, pp. 73–81, 2020. https://doi.org/10.1007/978-3-030-63426-1_8

Motivated by the above issues, we propose a novel keyframe extraction approach in this paper. Given an input video, we first split it into frames and classify them with existing deep learning networks. The classified frames are split into initial clips, each of which has consistent content. We then design an adaptive window length for frame difference processing, which takes the computed initial clips as input and outputs the initial keyframes. We also remove frames carrying no meaningful information, such as a black screen, from these results. Finally, to obtain the desired keyframes, we map the current keyframes to the space of the current clips and optimize the keyframes based on similarity.

Our method is simple yet effective. It builds on deep learning classification and frame difference processing. Experiments validate our approach and demonstrate that it outperforms or is comparable to state-of-the-art keyframe extraction techniques.

The main contributions of this paper are:

– a novel robust keyframe extraction approach that fits various types of videos;
– the design of the adaptive window length and the removal of meaningless frames;
– a mapping scheme and an optimization method on