Understanding contents of filled-in Bangla form images


Rajdeep Bhattacharya, et al. [full author details at the end of the article]

Received: 14 October 2019 / Revised: 20 July 2020 / Accepted: 27 August 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

With a wide variety of forms being generated in different organizations daily, efficient and quick retrieval of information from these forms has become a pressing need. The data on these forms are essential for commercial and professional purposes, so they must be retrieved efficiently for further processing. An automatic form processing system retrieves the content of a filled-in form image and stores it in a usable form. Although a large population of the world speaks Bangla, to the best of our knowledge no significant research work in the literature deals with form data written in Bangla. To bridge this research gap, we have developed a system that addresses four important aspects of processing form data written in Bangla script. Our work is divided into four major modules: touching component separation, text non-text separation, handwritten printed text separation, and alphabet numeral separation. The touching component separation problem is addressed using a novel rule-based method. For text non-text separation, handwritten printed text separation, and alphabet numeral separation, we use a machine learning approach based on feature engineering, where the model for each case was finalized after exhaustive experiments. Further, in each of these three modules, we apply some new features along with some existing ones to tune the modules for the best results. We have also prepared our own database of filled-in forms. To create the training models, the filled-in form images are first binarized, and then the different types of components are colored uniquely to obtain images that serve as ground truth. Evaluation of the modules on this database produces reasonably satisfactory results considering the complexity of the research problem.
The code, along with some filled-in sample form images and their respective ground truth images, is available at https://github.com/rajdeep-cse17/Form_Processing.

Keywords Form processing · Text non-text separation · Touching component separation · Alphanumeric separation · Bangla script
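The ground-truth preparation mentioned in the abstract (binarize the form image, then color each type of component uniquely) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed global threshold, the 8-connected breadth-first labeling, the function names, and the color palette are all assumptions chosen for illustration, and the class of each component is supplied by hand, as an annotator would.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to 1 = ink, 0 = background.
    A fixed global threshold is an assumption; the paper does not state the method here."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def label_components(binary):
    """Label 8-connected foreground components with breadth-first search.
    Returns (labels, count); labels are 1..count, background stays 0."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count

# Hypothetical palette: one RGB color per component type.
PALETTE = {
    "printed": (0, 0, 255),
    "handwritten": (255, 0, 0),
    "non_text": (0, 255, 0),
}
BACKGROUND = (255, 255, 255)

def color_ground_truth(labels, component_class):
    """Paint every pixel of a component with the color of its annotated class."""
    return [
        [PALETTE[component_class[lab]] if lab else BACKGROUND for lab in row]
        for row in labels
    ]

# Tiny example: two separate dark strokes on a light background.
gray = [
    [255,   0, 255, 255, 255],
    [255,   0, 255, 255,  10],
    [255, 255, 255, 255,  10],
]
binary = binarize(gray)
labels, count = label_components(binary)
# Manual annotation of each component, standing in for the labeling step.
truth = color_ground_truth(labels, {1: "printed", 2: "handwritten"})
```

Colored images of this kind give a per-component class label that a classifier for text non-text, handwritten printed, or alphabet numeral separation could be trained against.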

Multimedia Tools and Applications

1 Introduction

A form is a specific category of document that is regularly used as an essential means of information collection in various sectors of society, including banks, railways, schools and other offices. These forms can be paper-based as well as digital [1]. Though certain sectors have started using digital forms, they are far from becoming the norm, and even in the modern era of digitization, production and use of forms as hard copy is a very common and con