Symbol Spotting in Digital Libraries: Focused Retrieval over Graphic-rich Document Collections

The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet




Marçal Rusiñol  Josep Lladós

Symbol Spotting in Digital Libraries Focused Retrieval over Graphic-rich Document Collections

Foreword by Karl Tombre

Marçal Rusiñol, Departament de Ciències de la Computació, Centre de Visió per Computador, Universitat Autònoma de Barcelona, Edifici O, Campus UAB, 08193 Bellaterra, Spain [email protected]

Josep Lladós, Departament de Ciències de la Computació, Centre de Visió per Computador, Universitat Autònoma de Barcelona, Edifici O, Campus UAB, 08193 Bellaterra, Spain [email protected]

ISBN 978-1-84996-207-0
e-ISBN 978-1-84996-208-7
DOI 10.1007/978-1-84996-208-7

Springer London Dordrecht Heidelberg New York

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Control Number: 2010927492

© Springer-Verlag London Limited 2010

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licenses issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc., in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Cover design: KünkelLopka GmbH, Heidelberg

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Foreword

Pattern recognition basically deals with the recognition of patterns, shapes, objects, things in images. Document image analysis was one of the very first applications of pattern recognition and even of computing. But until the 1980s, research in this field mainly dealt with text-based documents, including OCR (Optical Character Recognition) and page layout analysis. Only a few people were looking at more specific documents such as sheet music, bank cheques or forms. The community of graphics recognition became visible in the late 1980s. Their specific interest was to recognize high-level objects represented by line drawings and graphics. The specific pattern recognition problems they had to deal with were raster-to-graphics conversion (i.e., recognizing graphical primitives in a cluttered pixel image), text-graphics separation, and symbol recognition. The specific problem of symbol recognition in graphical documents has received a