Novel Techniques for Dialectal Arabic Speech Recognition

Novel Techniques for Dialectal Arabic Speech Recognition describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standa


Mohamed Elmahdy • Rainer Gruhn • Wolfgang Minker

Novel Techniques for Dialectal Arabic Speech Recognition

Mohamed Elmahdy, Qatar University, Doha, Qatar

Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany

Rainer Gruhn, SVOX Deutschland GmbH, Ulm, Germany

ISBN 978-1-4614-1905-1
e-ISBN 978-1-4614-1906-8
DOI 10.1007/978-1-4614-1906-8
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2012932302

© Springer Science+Business Media New York 2012

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

For my parents, wife, and kids

Preface

This book describes novel approaches to improve automatic speech recognition for dialectal Arabic. Since the existing dialectal Arabic speech resources available for training speech recognition systems are very sparse and lack quality, we describe how existing Modern Standard Arabic (MSA) speech resources can be applied to dialectal Arabic speech recognition. Our assumption is that MSA is always a second language for all Arabic speakers, and in most cases we can identify the original dialect of a speaker even when he or she is speaking MSA. Hence, an acoustic model trained with a sufficient number of MSA speakers from different origins will implicitly model the acoustic features of the different Arabic dialects; in this case, it can be called dialect-independent acoustic modeling.

In this work, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA ranks first among Arabic dialects in number of speakers. A high-quality ECA speech corpus with accurate phonetic transcriptions has been collected. MSA acoustic models were trained using news broadcast speech data. MSA and the different Arabic dialects do not, however, share the same phoneme set. Therefore, in order to use MSA cross-lingually in dialectal Arabic speech recognition, we propose phoneme set normalization. We have normalized the phoneme sets for MSA and ECA. After phoneme set normalization, we applied state-of-the-art acoustic model adaptation techniques such as Maximum Likelihood Linear Regression (MLLR) and Maximum A-Posteriori (MAP) to adapt existing phonemic MSA acoustic models with a small amount of ECA speech data. Speech recognition r