From Text Detection to Text Segmentation: A Unified Evaluation Scheme
EPITA-LRDE, 14-16, rue Voltaire, 94276 Le Kremlin Bicêtre, France {calarasanu,jonathan.fabrizio}@lrde.epita.fr
Sorbonne Universités, UPMC Univ Paris 06, CNRS, UMR 7222, ISIR, 75005 Paris, France [email protected]
Abstract. Current text segmentation evaluation protocols are often incapable of properly handling different scenarios (broken/merged/partial characters). This leads to scores that incorrectly reflect the segmentation accuracy. In this article we propose a new evaluation scheme that overcomes most of the existing drawbacks by extending the EvaLTex protocol (initially designed to evaluate text detection at region level). This new unified platform has numerous advantages: it is able to evaluate a text understanding system at every detection stage and granularity level (paragraph/line/word and now character) by using the same metrics and matching rules; it is robust to all segmentation scenarios; it provides a qualitative and quantitative evaluation and a visual score representation that captures the whole behavior of a segmentation algorithm. Experimental results on nine segmentation algorithms using different evaluation frameworks are also provided to highlight the benefits of our method.
Keywords: Evaluation protocol · Evaluation metrics · Text segmentation

1 Introduction
During the last decade, text understanding systems have received a lot of attention from both the research and the industry communities. In particular, end-to-end systems became popular due to their complex processing chain. Such systems rely on different processing stages, such as text segmentation, text grouping, text classification, text localization, text rectification or text recognition. Among these stages, text segmentation is a phase of crucial importance not only for many end-to-end systems but also for other applications that rely on it, such as image inpainting (e.g. subtitle removal [21]). The interest in this stage is also reflected by the organization of numerous competitions around this topic, such as the ICDAR 2013 [15] and 2015 [13] Robust Reading Competitions (Task 2), with Challenge 1 on born-digital images and Challenge 2 on natural scenes.

© Springer International Publishing Switzerland 2016. G. Hua and H. Jégou (Eds.): ECCV 2016 Workshops, Part I, LNCS 9913, pp. 378–394, 2016. DOI: 10.1007/978-3-319-46604-0_28
The different stages of an end-to-end system are evaluated in different ways, depending on the type of text representation at each level. For example, text segmentation implies evaluating a binary representation, while text localization is usually evaluated by comparing bounding box positions in an image. Finally, text recognition is evaluated based on the transcription obtained by using an OCR engine. Frequently, the evaluation of the transcription is also used as a final result for many end-to-end systems. This result provides a combined evaluation based on the recognition accuracy of the OCR used and the localization preci