Information Extraction in the Web Era Natural Language Communica


Subseries of Lecture Notes in Computer Science

2700

Berlin Heidelberg New York Hong Kong London Milan Paris Tokyo

Maria Teresa Pazienza (Ed.)

Information Extraction in the Web Era Natural Language Communication for Knowledge Acquisition and Intelligent Information Agents

Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany

Volume Editor
Maria Teresa Pazienza
AI Research Group
Department of Computer Science, Systems and Production
Via del Politecnico 1
00133 Roma, Italy
E-mail: [email protected]

Cataloging-in-Publication Data applied for

A catalog record for this book is available from the Library of Congress.

Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at .

CR Subject Classification (1998): I.2, H.3, H.2.8, H.4

ISSN 0302-9743
ISBN 3-540-40579-8 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York
a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de

© Springer-Verlag Berlin Heidelberg 2003
Printed in Germany

Typesetting: Camera-ready by author, data conversion by PTP-Berlin GmbH
Printed on acid-free paper
SPIN: 10928653 06/3142 543210

Preface

The number of research topics covered in recent approaches to Information Extraction (IE) is continually growing as new facts are being considered. While users are interested mainly in the success of the overall process of locating facts of interest in document collections, the process itself depends on several constraints (e.g. the domain, the size and location of the collection, and the document type). It currently tackles composite scenarios, including free texts as well as semi-structured and structured texts such as Web pages, e-mails, etc. The handling of all these factors is tightly related to the continued evolution of the underlying technologies. In the last few years, real-world applications have shown the need for scalable, adaptable IE systems (see M.T. Pazienza, “Information Extraction: Towards Scalable Adaptable Systems”, LNAI 1714) that limit the human intervention required to customize an IE application and port it to new domains. Scalability and adaptability requirements are sti