A retrieval algorithm for encrypted speech based on convolutional neural network and deep hashing


Qiu-yu Zhang 1 & Yu-zhou Li 1 & Ying-jie Hu 1

Received: 8 July 2019 / Revised: 17 July 2020 / Accepted: 27 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

In this paper, we propose a retrieval algorithm for encrypted speech based on a convolutional neural network (CNN) and deep hashing. It addresses the feature extraction defects of existing content-based encrypted speech retrieval methods and the low retrieval accuracy caused by the high dimensionality and temporal nature of audio data. First, the original speech is encrypted with a three-dimensional chaotic encryption algorithm and uploaded to the encrypted speech library in the cloud. Because a CNN can capture the basic semantic structure features of speech data well, we use a CNN as a feature extractor to extract deep features from the Log-Mel spectrogram/MFCC. Batch normalization is introduced in the training process, which speeds up network fitting, reduces the training time, and improves the retrieval efficiency of the system. Second, the deep features extracted by the CNN are combined with a hash function to construct the system's hashing index table. Finally, retrieval is implemented with the normalized Hamming distance algorithm. The experimental results show that the proposed algorithm has better discrimination and robustness to amplitude change than existing methods. Meanwhile, the proposed algorithm achieves high recall, precision, and retrieval efficiency after various content-preserving operations.

Keywords Encrypted speech retrieval · Convolutional neural network (CNN) · Deep hashing · Speech feature extraction · Batch normalization algorithm
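The final matching step named in the abstract, retrieval by normalized Hamming distance over binary hash codes, can be illustrated with a minimal sketch. The abstract does not specify the hash function or the decision threshold, so the mean-threshold `binarize` helper, the `retrieve` function, and the threshold value of 0.25 below are hypothetical choices made only for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(features: Sequence[float]) -> List[int]:
    """Hypothetical hash step: 1 where a feature exceeds the vector mean, else 0.

    This is an assumption for illustration; the paper's hash function may differ.
    """
    mean = sum(features) / len(features)
    return [1 if x > mean else 0 for x in features]


def normalized_hamming(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of bit positions in which two equal-length hash codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def retrieve(
    query: Sequence[int],
    index: Dict[str, Sequence[int]],
    threshold: float = 0.25,  # assumed value, not taken from the paper
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs within the threshold, closest first."""
    scored = [(name, normalized_hamming(query, code)) for name, code in index.items()]
    return sorted((m for m in scored if m[1] <= threshold), key=lambda m: m[1])
```

In this sketch the hashing index table is a plain dictionary from a clip identifier to its hash code; a query code is compared against every entry, and entries whose distance falls below the threshold are reported as matches.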

* Qiu-yu Zhang [email protected]
Yu-zhou Li [email protected]
Ying-jie Hu [email protected]

1 School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China

Multimedia Tools and Applications

1 Introduction

With the increasing popularity of multimedia acquisition equipment, the explosive growth of multimedia data, represented by audio, brings not only unprecedented opportunities but also security and privacy challenges. In some specific environments, these audio data contain important and sensitive confidential content, such as instructions in the military field, speech evidence in litigation, and conference recordings in telecommunications and finance, which are closely related to the privacy and security of individuals and society. How to retrieve the required content from massive data accurately and quickly while ensuring the privacy and security of user data has always been one of the focuses in the field of audio retrieval [6]. At present, storing speech data in the cloud saves local space for users and facilitates data sharing between different clients. Meanwhile, it also brings problems of retrieval, privacy leakage, and data insecurity [23]. I