New Era for Robust Speech Recognition Exploiting Deep Learning

This book covers the state-of-the-art in deep neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights and detailed descriptions of some of the new concepts and key technologies in the field, inclu


Shinji Watanabe • Marc Delcroix • Florian Metze • John R. Hershey Editors


Editors

Shinji Watanabe, Mitsubishi Electric Research Laboratories (MERL), Cambridge, Massachusetts, USA

Marc Delcroix, NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan

Florian Metze, Language Technologies Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA

John R. Hershey, Mitsubishi Electric Research Laboratories (MERL), Cambridge, Massachusetts, USA

ISBN 978-3-319-64679-4

ISBN 978-3-319-64680-0 (eBook)

DOI 10.1007/978-3-319-64680-0

Library of Congress Control Number: 2017955274

© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG

The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

This book is dedicated to the memory of Yajie Miao ( 2016), who tragically passed away during the preparation of this book.

Preface

The field of automatic speech recognition has evolved greatly since the introduction of deep learning, which began only about 5 years ago. In particular, as more and more products using speech recognition are being deployed, there is a crucial need for increased noise robustness, which is well served by deep learning methods. This book covers the state of the art in noise robustness for deep-neural-network-based speech recognition with a focus on applications to distant speech. Some of the main actors in the areas of front-end and back-end research on noise-robust speech recognition gathered in Seattle for the 2015 Jelinek Speech and Language Summer Workshop. The
