Natural Language Processing Using Very Large Corpora

ABOUT THIS BOOK This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It is not meant as an introduction to this field; for readers who need one, several entry-level texts are



Text, Speech and Language Technology VOLUME 11

Series Editors
Nancy Ide, Vassar College, New York
Jean Véronis, Université de Provence and CNRS, France

Editorial Board
Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands
Kenneth W. Church, AT&T Bell Labs, New Jersey, USA
Judith Klavans, Columbia University, New York, USA
David T. Barnard, University of Regina, Canada
Dan Tufis, Romanian Academy of Sciences, Romania
Joaquim Llisterri, Universitat Autònoma de Barcelona, Spain
Stig Johansson, University of Oslo, Norway
Joseph Mariani, LIMSI-CNRS, France

The titles published in this series are listed at the end of this volume.

Natural Language Processing Using Very Large Corpora

Edited by

Susan Armstrong ISSCO, University of Geneva, Switzerland

Kenneth Church AT&T Labs-Research

Pierre Isabelle Xerox Research Centre Europe, France

Sandra Manzi ISSCO, University of Geneva, Switzerland

Evelyne Tzoukermann Lucent, Bell Laboratories

and David Yarowsky Johns Hopkins University, Baltimore, Maryland, U.S.A.

SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.

A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN 978-90-481-5349-7 DOI 10.1007/978-94-017-2390-9

ISBN 978-94-017-2390-9 (eBook)

Printed on acid-free paper

All Rights Reserved
© 1999 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1999
No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

TABLE OF CONTENTS

Introduction

Implementation and Evaluation of a German HMM for POS Disambiguation
Helmut Feldweg

Improvements in Part-of-Speech Tagging with an Application To German
Helmut Schmid

Unsupervised Learning of Disambiguation Rules for Part-of-Speech Tagging
Eric Brill and Mihai Pop

Tagging French without Lexical Probabilities - Combining Linguistic Knowledge and Statistical Learning
Evelyne Tzoukermann, Dragomir Radev, and William Gale

Example-Based Sense Tagging of Running Chinese Text
Xiang Tong, Chang-ning Huang, and Cheng-ming Guo

Disambiguating Noun Groupings with Respect to WordNet Senses
Philip Resnik

A Comparison of Corpus-based Techniques for Restoring Accents in Spanish and French Text
David Yarowsky

Beyond Word N-Grams
Fernando Pereira, Yoram Singer, and Naftali Tishby

Statistical Augmentation of a Chinese Machine-Readable Dictionary
Pascale Fung and Dekai Wu