A Fast LSF Search Algorithm Based on Interframe Correlation in G.723.1
Sameer A. Kibey
Digital Signal Processing and Multimedia Group, Tata Elxsi Ltd., Whitefield Road, Hoody, Bangalore 560048, India
Email: [email protected]

Jaydeep P. Kulkarni
Centre for Electronics Design and Technology, Indian Institute of Science, Bangalore 560012, India
Email: [email protected]

Piyush D. Sarode
Honeywell Technology Solutions Labs Pvt. Ltd., Bangalore 560076, India
Email: [email protected]

Received 16 December 2002; Revised 15 October 2003; Recommended for Publication by Ulrich Heute

We explain a time-complexity reduction algorithm that improves the line spectral frequencies (LSF) search procedure on the unit circle for low bit rate speech codecs. The algorithm is based on the strong interframe correlation exhibited by LSFs. The fixed-point C code of ITU-T Recommendation G.723.1, which uses the "real root algorithm," was modified, and the results were verified on an ARM7TDMI general-purpose RISC processor. The algorithm works for all test vectors provided by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) as well as for real speech. The average time reduction in the search computation was found to be approximately 20%.

Keywords and phrases: line spectral frequencies, linear predictive coding, unit circle, interframe correlation, G.723.1.
1. INTRODUCTION
The underlying assumption in most speech processing schemes, including speech coding, is the short-time stationarity of the speech signal [1]. Based on this assumption, the input speech is divided into frames of (typically) 20–30 ms, and each frame is processed to give a set of parameters defined by the source-filter model of speech production [2]. Encoding these parameters requires fewer bits than conventional waveform coding [2]. In this model, the combined effects of the glottis, the vocal tract, and the radiation of the lips are represented by a time-varying digital filter. The driving input (or excitation) to the filter is modeled as either an impulse train (for voiced speech) or random noise (for unvoiced speech). In order to obtain the speech parameters, the principle of linear prediction is employed [1, 2]. By minimizing the mean squared error between the actual speech samples and the linearly predicted ones over a finite interval, a unique set of predictor coefficients can be determined. The transfer function of the time-varying filter is of the form

\[
H(z) = \frac{G}{1 + \sum_{k=1}^{p} \alpha_k z^{-k}}. \tag{1}
\]
Here G is the gain parameter, p is the order of the predictor (typically 10), and α_k are the coefficients of the filter. The recursive Levinson-Durbin algorithm is generally used to obtain optimum estimates of the α_k coefficients in the least mean squared error sense [1, 2]. These coefficients contain the formant information and are therefore very important parameters. However, for the purpose of quantization, the predictor coefficients α_k, also known as linear predictive coding (LPC) parameters, are converted into a set of numbers called line spectral frequencies (LSFs).
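As a concrete illustration of (1) and of the recursion just mentioned, the following minimal C sketch computes the α_k from a frame's autocorrelation sequence and then applies the resulting all-pole synthesis filter, s[n] = G e[n] - Σ_{k=1}^{p} α_k s[n-k]. The identifiers and the use of floating-point arithmetic are assumptions made for illustration only; the G.723.1 reference code implements these steps in fixed point.

/* Illustrative floating-point sketch; not the fixed-point G.723.1 code. */
#include <stddef.h>

#define LPC_ORDER 10

/* Levinson-Durbin recursion: given autocorrelations r[0..p] (p <= LPC_ORDER),
   compute the predictor coefficients alpha_1..alpha_p of (1), stored in
   a[0..p-1], that minimize the mean squared prediction error.
   Returns the final prediction error energy. */
double levinson_durbin(const double r[], double a[], int p)
{
    double tmp[LPC_ORDER];
    double err = r[0];                      /* E_0 */

    for (int i = 1; i <= p; i++) {
        double acc = r[i];                  /* i-th reflection coefficient */
        for (int j = 1; j < i; j++)
            acc += a[j - 1] * r[i - j];
        double k = -acc / err;

        for (int j = 1; j < i; j++)         /* update alpha_1..alpha_{i-1} */
            tmp[j - 1] = a[j - 1] + k * a[i - j - 1];
        for (int j = 1; j < i; j++)
            a[j - 1] = tmp[j - 1];
        a[i - 1] = k;                       /* alpha_i = k_i */

        err *= (1.0 - k * k);               /* E_i = (1 - k_i^2) E_{i-1} */
    }
    return err;
}

/* All-pole synthesis filter of (1): s[n] = G*e[n] - sum_k alpha_k*s[n-k],
   assuming a zero initial filter state. */
void lpc_synthesis(const double *e, double *s, size_t n_samples,
                   const double a[LPC_ORDER], double gain)
{
    for (size_t n = 0; n < n_samples; n++) {
        double acc = gain * e[n];
        for (size_t k = 1; k <= LPC_ORDER; k++)
            if (n >= k)
                acc -= a[k - 1] * s[n - k];
        s[n] = acc;
    }
}

In the codec, the α_k obtained this way are then converted to LSFs for quantization, which is where the search procedure studied in this paper applies.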