Robust Adaptation to Non-Native Accents in Automatic Speech Recognition

Speech recognition technology is being increasingly employed in human-machine interfaces. A remaining problem, however, is the robustness of this technology to non-native accents, which still cause considerable difficulties for current systems. In this book



Lecture Notes in Computer Science 2560
Edited by G. Goos, J. Hartmanis, and J. van Leeuwen

Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Tokyo

Silke Goronzy

Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany

Author
Silke Goronzy
Sony International (Europe) GmbH, SCLE, MMI Lab
Heinrich-Hertz-Straße 1, 70327 Stuttgart, Germany

Cataloging-in-Publication Data applied for

A catalog record for this book is available from the Library of Congress.

Bibliographic information published by Die Deutsche Bibliothek.
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at .

CR Subject Classification (1998): I.2.7, I.2, J.5, H.5.2, F.4.2

ISSN 0302-9743
ISBN 3-540-00325-8 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de

© Springer-Verlag Berlin Heidelberg 2002
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Boller Mediendesign
Printed on acid-free paper
SPIN: 10871801 06/3142 543210

Preface

Speech recognition technology is being increasingly employed in human-machine interfaces. Two of the key problems affecting such technology, however, are its robustness across different speakers and its robustness to non-native accents, both of which still create considerable difficulties for current systems. This book describes methods to overcome these problems. A speaker adaptation algorithm based on Maximum Likelihood Linear Regression (MLLR) is developed that can adapt the acoustic models to the current speaker with just a few words of speaker-specific data. It is combined with confidence measures that focus on phone durations as well as on acoustic features to yield a semi-supervised adaptation approach. Furthermore, a specific pronunciation modelling technique that allows the automatic derivation of non-native pronunciations without using non-native data is described. This technique is combined with the confidence measures and speaker adaptation techniques to produce a robust adaptation to non-native accents in an automatic speech recognition system. The aim of this book is to present the state of th
