From Text Detection to Text Segmentation: A Unified Evaluation Scheme

EPITA-LRDE, 14-16, rue Voltaire, 94276 Le Kremlin Bicêtre, France {calarasanu,jonathan.fabrizio}@lrde.epita.fr
Sorbonne Universités, UPMC Univ Paris 06, CNRS, UMR 7222, ISIR, 75005 Paris, France [email protected]

Abstract. Current text segmentation evaluation protocols are often incapable of properly handling different scenarios (broken/merged/partial characters). This leads to scores that incorrectly reflect the segmentation accuracy. In this article we propose a new evaluation scheme that overcomes most of the existing drawbacks by extending the EvaLTex protocol (initially designed to evaluate text detection at region level). This new unified platform has numerous advantages: it is able to evaluate a text understanding system at every detection stage and granularity level (paragraph/line/word and now character) by using the same metrics and matching rules; it is robust to all segmentation scenarios; it provides a qualitative and quantitative evaluation and a visual score representation that captures the whole behavior of a segmentation algorithm. Experimental results on nine segmentation algorithms using different evaluation frameworks are also provided to highlight the benefits of our method.

Keywords: Evaluation protocol · Evaluation metrics · Text segmentation

1 Introduction

During the last decade, text understanding systems have received a lot of attention from both the research and the industry communities. In particular, end-to-end systems have become popular due to their complex processing chain. Such systems rely on different processing stages, such as text segmentation, text grouping, text classification, text localization, text rectification or text recognition. Among these stages, text segmentation is of crucial importance not only for many end-to-end systems but also for other applications that rely on it, such as image inpainting (e.g. subtitle removal [21]). The interest in this stage is also reflected by the organization of numerous competitions around this topic, such as the ICDAR 2013 [15] and 2015 [13] Robust Reading Competitions (Task 2), with Challenge 1 on born-digital images and Challenge 2 on natural scenes.

© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part I, LNCS 9913, pp. 378–394, 2016. DOI: 10.1007/978-3-319-46604-0_28

The different stages of an end-to-end system are evaluated in different ways, depending on the type of text representation at each level. For example, text segmentation implies evaluating a binary representation, while text localization is usually evaluated by comparing bounding box positions in an image. Finally, text recognition is evaluated based on the transcription obtained by an OCR engine. Frequently, the evaluation of the transcription is also used as the final result for many end-to-end systems. This result provides a combined evaluation based on the recognition accuracy of the OCR used and the localization preci
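To make the contrast between the two representations concrete, here is a minimal Python sketch. It is a generic illustration only and not the EvaLTex protocol or any competition's official metric: the function names `pixel_precision_recall` and `box_iou`, the mask and box encodings, and the choice of pixel-wise precision/recall and intersection over union are all assumptions made for this example.

```python
from typing import Sequence, Tuple

# Assumed encodings (hypothetical, for illustration only):
Mask = Sequence[Sequence[int]]    # rows of 0/1 pixels, 1 = text
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), max exclusive


def pixel_precision_recall(gt: Mask, pred: Mask) -> Tuple[float, float]:
    """Pixel-wise precision and recall between two binary masks of equal size."""
    tp = fp = fn = 0
    for gt_row, pred_row in zip(gt, pred):
        for g, p in zip(gt_row, pred_row):
            if p and g:
                tp += 1      # text pixel correctly segmented
            elif p:
                fp += 1      # background pixel marked as text
            elif g:
                fn += 1      # text pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned bounding boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A segmentation result would be scored with the first function on the ground-truth and predicted masks, and a localization result with the second on pairs of boxes. Neither simple measure distinguishes broken, merged or partial characters.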
