Information Retrieval Architecture and Algorithms

This text presents a theoretical and practical examination of the latest developments in Information Retrieval and their application to existing systems. By starting with a functional discussion of what is needed for an information system, the reader can


Gerald Kowalski

Information Retrieval Architecture and Algorithms

Gerald Kowalski
Ashburn, VA, USA

ISBN 978-1-4419-7715-1
e-ISBN 978-1-4419-7716-8
DOI 10.1007/978-1-4419-7716-8

Springer New York Dordrecht Heidelberg London

© Springer Science+Business Media, LLC 2011

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

This book is dedicated to my grandchildren, Adeline, Bennet, Mollie Kate and Riley, who are the future.

Jerry Kowalski

Preface

Information Retrieval has changed radically over the last 25 years. When I first started teaching Information Retrieval and developing large Information Retrieval systems in the 1980s, it was easy to cover the area in a single-semester course. Most of the discussion was theoretical, with testing done on small databases, and only a small subset of the theory could be implemented in commercial systems. Massive amounts of data in the right digital format for search did not yet exist. Since 2000, the field of Information Retrieval has undergone a major transformation driven by massive amounts of new data (e.g., the Internet, Facebook) that need to be searched, new hardware technologies that make the storage and processing of data feasible, and software architecture changes that provide the scalability to handle massive data sets. In addition, multimedia, in particular images, audio and video, is now part of everyone’s information world, and users want to retrieve it as well as traditional text. In the textual domain, languages other than English are becoming far more prevalent on the Internet. Solving information retrieval problems no longer centers on search algorithm improvements. Now that Information Retrieval Systems are commercially available, as Data Base Management Systems are, a system-level approach is needed to understand how to provide the search and retrieval capabilities that users need. To understand modern information retrieval, it is necessary to understand search and retrieval for both text and multimedia formats. Although search algorithms are important, other aspects of the total system such as pre-processing on ingest of data and how to display the search res