Robust Adaptation to Non-Native Accents in Automatic Speech Recognition
Speech recognition technology is being increasingly employed in human-machine interfaces. A remaining problem, however, is the robustness of this technology to non-native accents, which still cause considerable difficulties for current systems. In this book
Lecture Notes in Computer Science 2560
Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Tokyo
Silke Goronzy
Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Author
Silke Goronzy
Sony International (Europe) GmbH, SCLE, MMI Lab
Heinrich-Hertz-Straße 1, 70327 Stuttgart, Germany
E-mail: [email protected]
Cataloging-in-Publication Data applied for A catalog record for this book is available from the Library of Congress. Bibliographic information published by Die Deutsche Bibliothek. Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at .
CR Subject Classification (1998): I.2.7, I.2, J.5, H.5.2, F.4.2
ISSN 0302-9743
ISBN 3-540-00325-8 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de
© Springer-Verlag Berlin Heidelberg 2002
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Boller Mediendesign
Printed on acid-free paper SPIN: 10871801 06/3142 543210
Preface
Speech recognition technology is being increasingly employed in human-machine interfaces. Two of the key problems affecting such technology, however, are its robustness across different speakers and its robustness to non-native accents, both of which still create considerable difficulties for current systems. This book describes methods to overcome these problems. A speaker adaptation algorithm based on Maximum Likelihood Linear Regression (MLLR) is developed that can adapt the acoustic models to the current speaker with just a few words of speaker-specific data. It is combined with confidence measures that focus on phone durations as well as on acoustic features to yield a semi-supervised adaptation approach. Furthermore, a pronunciation modelling technique that allows the automatic derivation of non-native pronunciations without using non-native data is described and combined with the confidence measures and speaker adaptation techniques to produce a robust adaptation to non-native accents in an automatic speech recognition system. The aim of this book is to present the state of th