Speech Enhancement in the STFT Domain

This work addresses this problem in the short-time Fourier transform (STFT) domain. We divide the general problem into five basic categories depending on the number of microphones being used and whether the interframe or interband correlation is considere


For further volumes: http://www.springer.com/series/10059

Jacob Benesty • Jingdong Chen • Emanuël A. P. Habets



Prof. Dr. Jacob Benesty, INRS-EMT, University of Quebec, de la Gauchetiere Ouest 800, Montreal, QC H5A 1K6, Canada, e-mail: [email protected]

Emanuël A. P. Habets, International Audio Laboratories Erlangen, University of Erlangen-Nuremberg, Am Wolfsmantel 33, 91058 Erlangen, Germany, e-mail: [email protected]

Jingdong Chen, Northwestern Polytechnical University, Youyi West Road 127, 710072 Xi’an, People’s Republic of China, e-mail: [email protected]

ISSN 2191-8112
e-ISSN 2191-8120
ISBN 978-3-642-23249-7
e-ISBN 978-3-642-23250-3
DOI 10.1007/978-3-642-23250-3

Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2011937831

© The Author(s) 2012

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: eStudio Calamar, Berlin/Figueres

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Contents

1 Introduction
1.1 Single-Channel Speech Enhancement in the STFT Domain: A Brief Review
1.2 Interframe Correlation
1.3 Benefit of Using Multiple Microphones
1.4 Interband Correlation
1.5 Organization of the Work
References

2 Single-Channel Speech Enhancement with a Gain
2.1 Signal Model
2.2 Microphone Signal Processing with a Gain
2.3 Performance Measures
2.3.1 Noise Reduction
2.3.2 Speech Distortion
2.3.3 Mean-Square Error Criterion
2.4 Optimal Gains
2.4.1 Wiener
2.4.2 Tradeoff
2.4.3 Maximum Signal-to-Noise Ratio

