Predictive Segmentation Using Multichannel Neural Networks in Arabic OCR System



Abstract. This article presents an open-vocabulary Arabic text recognition system built on two neural networks, one for segmentation and one for character recognition. Word segmentation in Arabic, as in many cursive scripts, is a challenge for OCR systems. This paper presents a multichannel neural network for offline segmentation of machine-printed Arabic documents. The segmented characters are then used as input to a convolutional neural network for Arabic character recognition. The accuracy of the segmentation model using one font is 98.9%, while the four-font model showed 95.5% accuracy. The accuracy of character recognition on the Arabic Transparent font at 18 pt from the APTI data set is 94.8%.

Keywords: Arabic segmentation · OCR · Convolutional neural networks

1 Introduction

In the classic problem of Arabic character recognition, the goal is to digitize Arabic documents into an electronic format. Because Arabic is cursive, research on the topic can be classified according to how a system recognizes words or sub-words. In [1–4], word-level features are used to classify the input as a word from a vocabulary set. In contrast, [5,6] recognize character features by using a preprocessing step to segment the input word; the segmented characters are then recognized by a character recognition model. Finally, [7,8] use a sliding window to recognize character features. This paper offers an open-vocabulary Arabic text recognition system using two neural networks, one for segmentation and one for character recognition.

Automatic segmentation of Arabic has always been a hard problem [9]. Unlike Latin-script languages, which are cursive mostly in handwritten text, Arabic and Farsi are cursive by nature, so typesetting is cursive in both machine-generated and handwritten text. Segmenting words into their constituent characters is a crucial step for the subsequent recognition phase. An Arabic character can have up to four different shapes according to its position in the word: isolated, start, middle, and end (Fig. 1a). Some characters differ only in the number of diacritics. Characters also have different heights and widths. Certain defined combinations of characters have special ligatures that connect them (Fig. 1b and c). Because of these characteristics, segmentation algorithms can fall short by over-segmenting wide characters or under-segmenting interleaving characters.

Fig. 1. (a) Different shapes of two characters. (b–c) Horizontal and vertical ligatures.

© Springer International Publishing AG 2016. F. Schwenker et al. (Eds.): ANNPR 2016, LNAI 9896, pp. 233–245, 2016. DOI: 10.1007/978-3-319-46182-3_20

Many algorithms devised for segmenting cursive Arabic documents make use of the structural pattern of lower pixel density between characters. Based on this pattern, a histogram of horizontal projection has been widely used for segmentation [10,11]. However, this method is