Speech Watermarking
3.1 Introduction

Speech is the most important form of human communication, and it carries valuable information about who is speaking, what is being said, and how it is said. The use of speech signals in computer science is growing for three major reasons [1]. First, speech is easy to produce, capture, and transmit, at a lower cost than images. Second, a speech signal can be captured from a distance (non-invasively). Third, speech carries other types of information, such as emotion, age, and gender.

In recent years, communication and computer technologies have grown rapidly, allowing digital speech to be transferred and shared without limitation. Moreover, available speech editing software can modify small parts of a speech signal in order to change its meaning. In addition, speech synthesis technology can produce the speech of a desired individual in a way that the human auditory system (HAS) cannot detect. Digital watermarking therefore appears necessary to address the resulting security, privacy, and protection problems, and speech watermarking is a popular and efficient way to apply it to speech signals. Speech watermarking can also contribute to other technologies, e.g., VoIP [2–4], military communication to guarantee originality [5–8], security of telephonic recordings, enhancing the security of online speaker/speech recognition systems [7, 8], and air traffic control (ATC), where the airplane is identified by watermarking the VHF radio channel [9–11].

This chapter introduces the universal speech model, linear predictive analysis (LPA), together with some preliminary information about the speech signal. It also reviews traditional approaches and related work on speech watermarking techniques to reveal the advantages and disadvantages of each technique.
© Springer Science+Business Media Singapore 2017 M.A. Nematollahi et al., Digital Watermarking, Springer Topics in Signal Processing 11, DOI 10.1007/978-981-10-2095-7_3
3.2 Speech Versus Audio

Unlike the audio (music) signal, which is non-stationary and non-deterministic in nature, each portion of the speech signal (between 20 and 30 ms) can be modeled by linear predictive analysis (LPA) because of its quasi-stationary nature [12]. In addition, speech and audio signals differ in syntactic/semantic structure, temporal structure, and spectral structure. Further differences between audio and speech signals lie in consonants, zero-crossing rate (ZCR), energy sequences, tonal duration, excitation patterns, harmonic pattern, dominant frequency, fundamental frequency, power distribution, alternative sequence, tonality, bandwidth, perception, and production [13]. Another difference is energy concentration: for a speech signal, the energy is concentrated below 4 kHz and limited to 8 kHz, whereas for an audio signal it extends to 20 kHz. These differences are used in speech/audio discriminator algorithms. Generally, intangibility is more important than qualit
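As a rough illustration of the frame-wise LPA model and of two of the discriminating features listed in this section (ZCR and energy), the following Python sketch may help. It is not taken from the chapter: the autocorrelation method with the Levinson-Durbin recursion is one common way to estimate LPA coefficients, and the function names, the 16 kHz sampling rate, the 25 ms frame length, and the model order of 12 are all assumptions chosen for illustration.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Estimate LPA coefficients of one frame (autocorrelation method).

    Returns (a, err), where a = [1, a_1, ..., a_order] are the coefficients
    of the prediction-error filter A(z) = 1 + sum_j a_j z^-j and err is the
    residual prediction-error energy. Hypothetical helper for illustration.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop the recursion
        # Reflection coefficient for stage i of the Levinson-Durbin recursion.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(np.asarray(frame, dtype=float))
    signs[signs == 0] = 1.0  # treat exact zeros as positive
    return float(np.mean(signs[1:] != signs[:-1]))


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    frame = np.asarray(frame, dtype=float)
    return float(np.mean(frame ** 2))


if __name__ == "__main__":
    fs = 16000                    # assumed sampling rate in Hz
    frame_len = int(0.025 * fs)   # 25 ms, inside the 20-30 ms range
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a speech frame: noise through a 2-pole resonator.
    excitation = rng.standard_normal(frame_len)
    frame = np.zeros(frame_len)
    for t in range(2, frame_len):
        frame[t] = 1.3 * frame[t - 1] - 0.6 * frame[t - 2] + excitation[t]
    coeffs, residual = lpc_autocorr(frame * np.hamming(frame_len), order=12)
    print("LPA coefficients:", np.round(coeffs, 3))
    print("ZCR:", zero_crossing_rate(frame))
    print("Energy:", short_time_energy(frame))
```

In a speech/audio discriminator of the kind mentioned above, such per-frame values would typically be computed over successive frames and compared across the signal; the decision rule itself is outside the scope of this sketch.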