Effective technique for the recognition of offline Arabic handwritten words using hidden Markov models



ORIGINAL PAPER

Sherif Abdel Azeem · Hany Ahmed

Received: 29 May 2012 / Revised: 15 September 2012 / Accepted: 28 January 2013 © Springer-Verlag Berlin Heidelberg 2013

Abstract In this paper, we present a novel segmentation-free Arabic handwriting recognition system based on hidden Markov models (HMMs). Two main contributions are introduced: a new technique for dividing the image into nonuniform horizontal segments to extract the features, and a new technique for handling the skewing of characters by fusing multiple HMMs. Moreover, two enhancements are introduced: the pre-processing method and feature extraction using concavity space. The proposed system first pre-processes the input image by setting the thickness of the input word to three pixels and fixing the spacing between the different parts of the word. The input image is divided into a constant number of nonuniform horizontal segments depending on the distribution of the foreground pixels. A set of robust features representing the gradient of the foreground pixels is extracted using sliding windows. The input image is also decomposed into several images representing the vertical, horizontal, left diagonal and right diagonal edges in the image, and a set of robust features representing the densities of the foreground pixels in these edge images is extracted using sliding windows. The proposed system builds character HMMs and learns word HMMs using embedded training. Besides the vertical sliding window, two slanted sliding windows are used to extract the features. Three different HMMs are used: one for the vertical sliding window and two for the slanted windows. A fusion scheme combines the three HMMs. The proposed system is very promising and outperforms all the other Arabic handwriting recognition systems reported in the literature.

S. A. Azeem · H. Ahmed
Electronics Engineering Department, American University in Cairo (AUC), Cairo, Egypt

Keywords Arabic handwriting recognition · Effective features · Fusion · Hidden Markov models
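
The abstract states that the image is divided into a constant number of nonuniform horizontal segments according to the distribution of the foreground pixels, but this excerpt does not give the exact rule. The following is a minimal sketch under one plausible assumption: each band should hold roughly the same share of foreground pixels. The function name `nonuniform_row_bounds`, the equal-mass criterion and the list-of-rows image format are illustrative choices, not the authors' method.

```python
from typing import List, Sequence


def nonuniform_row_bounds(image: Sequence[Sequence[int]], n_segments: int) -> List[int]:
    """Return n_segments + 1 row indices that split `image` into horizontal bands.

    Assumption (not taken from the paper): a band is closed as soon as the
    cumulative foreground count reaches the next equal share of the total.
    Band k covers rows bounds[k] .. bounds[k + 1] - 1 and may be empty when a
    single row carries more than one share of the foreground pixels.
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")

    # Number of foreground (non-zero) pixels in each row.
    row_counts = [sum(1 for px in row if px) for row in image]
    total = sum(row_counts)
    n_rows = len(row_counts)

    if total == 0:
        # Blank image: fall back to uniform bands.
        return [round(k * n_rows / n_segments) for k in range(n_segments + 1)]

    bounds = [0]
    cumulative = 0
    next_cut = 1
    for r, count in enumerate(row_counts):
        cumulative += count
        # Integer comparison of cumulative / total >= next_cut / n_segments.
        while next_cut < n_segments and cumulative * n_segments >= next_cut * total:
            bounds.append(r + 1)
            next_cut += 1
    bounds.append(n_rows)
    return bounds


# Toy 6 x 4 binary word image; per-row foreground counts are 0, 4, 4, 1, 1, 2.
toy_image = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
print(nonuniform_row_bounds(toy_image, 3))  # [0, 2, 3, 6]
```

In the toy image, the three bands (rows 0 to 1, row 2, rows 3 to 5) each contain four foreground pixels, so dense regions of the word get narrow bands and sparse regions get wide ones. Features from a sliding window would then be computed per band; that step is outside this sketch.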

1 Introduction

A handwriting recognition system can be either online or offline. Offline handwriting recognition is based on optical character recognition (OCR) and is usually applied to scanned documents. In online handwriting recognition, by contrast, the writer applies pressure to a digital instrument, and the sequence of points traced out by the pen is recorded. Offline handwriting recognition involves the automatic conversion of text in an image into letter codes that are usable within computer and text-processing applications, and it is generally observed to be harder than online handwriting recognition. In the online case, features can be extracted from both the pen trajectory and the resulting image, whereas in the offline case only the image is available. In recent years, some research has been done on the problem of offline Arabic handwriting recognit