Handwritten Text Segmentation via End-to-End Learning of Convolutional Neural Networks

Junho Jo² · Hyung Il Koo³ · Jae Woong Soh² · Nam Ik Cho¹,²

Received: 5 December 2019 / Revised: 10 August 2020 / Accepted: 13 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

We present a method that separates handwritten and machine-printed components that are mixed and overlapped in documents. Many conventional methods address this problem by extracting connected components (CCs) and classifying them into two classes. These methods assume that the two types of components do not overlap, whereas we focus on the more challenging and realistic case in which they often do. To this end, we propose a new method that performs pixel-level classification with a convolutional neural network. Unlike conventional neural network methods, our method works in an end-to-end manner and does not require any preprocessing steps (e.g., foreground extraction or handcrafted feature extraction). To train the network, we develop a cross-entropy based loss function that alleviates the class imbalance problem. As for training data, although some datasets of mixed printed characters and handwritten scripts exist, most of them contain no overlapping cases and provide no pixel-level annotations. Hence, we also propose a data synthesis method that generates realistic pixel-level training samples with many overlaps between printed and handwritten components. Experimental results on synthetic and real images show the effectiveness of the proposed method. Although the proposed network is trained only on synthetic images, it also improves the OCR rate on real documents. Specifically, removing the overlapping handwritten scribbles with our method increases the OCR rate for machine-printed text from 0.8087 to 0.9442.

Keywords Handwritten text segmentation · Text separation · Data synthesis · Class imbalance problem · Optical character recognition
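To make the class imbalance issue concrete, the following is a minimal sketch of a class-weighted pixel-wise cross-entropy. It is not the loss function developed in this paper, whose exact form is not given in the text above. The inverse-frequency weighting, the function name `weighted_pixel_cross_entropy`, and the array layout are all assumptions chosen for illustration.

```python
import numpy as np

def weighted_pixel_cross_entropy(logits, labels, eps=1e-12):
    """Class-weighted cross-entropy over all pixels (illustrative only).

    logits: float array of shape (H, W, C), raw per-pixel class scores.
    labels: int array of shape (H, W), values in [0, C).
    """
    num_classes = logits.shape[-1]

    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)

    # Assumption: inverse-frequency class weights, so that rare classes
    # (e.g. handwritten pixels) are not swamped by background pixels.
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(np.float64)
    weights = np.where(
        counts > 0, counts.sum() / (num_classes * np.maximum(counts, 1.0)), 0.0
    )

    # Negative log-likelihood of the true class at every pixel.
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    nll = -np.log(picked + eps)

    # Weighted average, normalized by the total weight.
    pixel_weights = weights[labels]
    return float((pixel_weights * nll).sum() / pixel_weights.sum())
```

With this weighting, a network that predicts background everywhere is penalized heavily for the few foreground pixels it misses, whereas an unweighted mean over pixels would report a small loss for the same prediction.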

Nam Ik Cho
[email protected]

1 Dept. of Electrical and Computer Engineering, Seoul National University, Gwanak-ro 1, Gwanak-Gu, Seoul 08826, Korea
2 Dept. of Electrical and Computer Eng., INMC, Seoul National University, Seoul, Korea
3 Dept. of Electrical and Computer Engineering, Ajou University, Suwon, Korea

Multimedia Tools and Applications

1 Introduction

Document digitization has been an essential topic for decades, and a huge number of methods have been proposed to address its many sub-tasks, such as optical character recognition (OCR) [22], text-line segmentation [8, 19], and layout analysis [1, 3]. As a result, there has been extensive research and development in machine-printed document understanding and handwritten text recognition, achieving human-comparable performance in well-controlled environments [22]. However, the understanding of mixed cases (i.e., documents having handwritten and machine-printed components on the same page) st
