Quad-Tree Based Image Segmentation and Feature Extraction to Recognize Online Handwritten Bangla Characters


1 Future Institute of Engineering and Management, Kolkata, India
2 Jadavpur University, Kolkata, India
3 West Bengal State University, Barasat, India

Abstract. In this paper, three feature extraction strategies, along with all of their possible combinations, are discussed in detail for the recognition of online handwritten Bangla basic characters. The target character is dissected with a quad-tree based image segmentation approach so that features can be extracted from the resulting blocks. Of the three techniques, one computes an area feature (using the composite Simpson's rule), while the other two extract local features (mass distribution and chord length). The authors also investigate the optimal depth of the quad-tree (while segmenting an image), that is, the depth at which the classifier performs best. The experiment was carried out on a dataset of 10,000 characters. Sequential Minimal Optimization (SMO) yields the highest recognition accuracy, 98.5%, when all three feature vectors are combined.

Keywords: Online handwriting recognition · Bangla script · Quad-tree based image segmentation · Composite Simpson's rule · Mass distribution · Chord length
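As a rough illustration of two of the ingredients named in the abstract, the following Python sketch shows a composite Simpson's rule and a quad-tree subdivision with a per-block mass feature. It is a minimal sketch under stated assumptions, not the authors' implementation: splitting each block at its geometric midpoint, defining mass as the fraction of foreground pixels in a block, and all function names (`composite_simpson`, `quadtree_leaves`, `mass_features`) are hypothetical choices made here for illustration.

```python
def composite_simpson(ys, h):
    """Composite Simpson's rule for equally spaced samples `ys` with spacing `h`.

    Needs an even number of intervals, i.e. an odd number of samples.
    """
    n = len(ys) - 1  # number of intervals
    if n < 2 or n % 2 != 0:
        raise ValueError("need an even number of intervals (odd number of samples)")
    odd = sum(ys[1:n:2])   # interior samples with odd index, weight 4
    even = sum(ys[2:n:2])  # interior samples with even index, weight 2
    return h / 3.0 * (ys[0] + 4.0 * odd + 2.0 * even + ys[n])


def quadtree_leaves(top, left, height, width, depth):
    """Return the (top, left, height, width) blocks after `depth` levels of splitting.

    Assumption: each block is split at its geometric midpoint.
    """
    if depth == 0 or height < 2 or width < 2:
        return [(top, left, height, width)]
    h2, w2 = height // 2, width // 2
    quadrants = [
        (top, left, h2, w2),                             # upper left
        (top, left + w2, h2, width - w2),                # upper right
        (top + h2, left, height - h2, w2),               # lower left
        (top + h2, left + w2, height - h2, width - w2),  # lower right
    ]
    leaves = []
    for quadrant in quadrants:
        leaves.extend(quadtree_leaves(*quadrant, depth - 1))
    return leaves


def mass_features(image, depth):
    """Fraction of foreground (1) pixels in each quad-tree leaf of a binary image.

    Assumption: `image` is a non-empty list of equal-length rows of 0/1 values.
    """
    rows, cols = len(image), len(image[0])
    features = []
    for top, left, h, w in quadtree_leaves(0, 0, rows, cols, depth):
        mass = sum(image[r][c]
                   for r in range(top, top + h)
                   for c in range(left, left + w))
        features.append(mass / (h * w))
    return features
```

With midpoint splitting, a depth of d produces up to 4^d blocks, so the length of the feature vector grows quickly with depth; this is one reason the choice of depth matters for the classifier.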

© Springer International Publishing AG 2016. F. Schwenker et al. (Eds.): ANNPR 2016, LNAI 9896, pp. 246–256, 2016. DOI: 10.1007/978-3-319-46182-3_21

1 Introduction

Handheld devices such as smartphones, tablets, the iPad, and A4 Take Note are now easily available at reasonable cost, and society increasingly depends on them because their many applications make life easier. One interesting property of such devices is that people can write on them freely, and the written information can be saved as online information: the pixel coordinates along the pen trajectory, together with the pen up/down status. By using such online devices, people not only reduce the chance of mistyping that arises with a keyboard but also save the extra time that typing the same information would have required. In online handwriting, the information is stored as real-time coordinate data points. In offline handwriting recognition, by contrast, the information is saved as images. In the latter case the data are prone to quality degradation, which leads to a high noise level, whereas in the former the stored coordinates do not suffer from this degradation, which makes online recognition more efficient than offline recognition. These benefits, in turn, make Online Handwriting Recognition (OHR) an emerging research domain. Although a substantial number of research publications are available in the literature for Devanagari script [1–8], the same cannot be said for Bangla script; the limited number of published works attests to this. Hence, researchers