Language identification from multi-lingual scene text images: a CNN based classifier ensemble approach



ORIGINAL RESEARCH

Neelotpal Chakraborty¹ · Soumyadeep Kundu¹ · Sayantan Paul¹ · Ayatullah Faruk Mollah² · Subhadip Basu¹ · Ram Sarkar¹

Received: 19 February 2020 / Accepted: 5 September 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract

Over the past two decades, detecting text regions in complex natural images has emerged as a problem of great interest to the research community. These regions of interest serve as a source of information that can be utilized for various purposes. However, they may contain text in multiple languages, so identifying the language of a detected scene text becomes important for further information processing. Language identification of text captured in the wild is an extremely challenging research problem in the domain of scene text recognition. In this paper, a deep learning-based classifier combination approach is proposed to solve the problem of language identification from multi-lingual scene text images. A minimalist Convolutional Neural Network (CNN) architecture is used as the base model. Five variants of an input image are passed through the base model separately: the three individual channels of the RGB color model (R for red, G for green, and B for blue), the RGB image itself, and the grayscale image. The outputs of these five models are combined using classifier combination approaches based on the sum rule and the product rule. The performance of the proposed model has been evaluated on standard datasets such as KAIST and MLe2e, as well as on an in-house multi-lingual scene text dataset. The experimental results show that the proposed model outperforms some state-of-the-art methods considered here for comparison.

Keywords  Multi-lingual · Language identification · Scene text · CNN · Classifier combination
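The pipeline summarized in the abstract (five image variants, one classifier per variant, fusion by sum rule or product rule) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `image_variants`, `sum_rule`, and `product_rule` are hypothetical, the grayscale weights are an assumed convention (ITU-R BT.601 luma) that the text does not specify, and the base CNN is omitted, so the per-model class probabilities are simply taken as given.

```python
import numpy as np


def image_variants(rgb):
    """Return the five input variants for an H x W x 3 RGB array."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Assumption: BT.601 luma weights; the source does not state the conversion.
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    return {
        "R": rgb[..., 0],
        "G": rgb[..., 1],
        "B": rgb[..., 2],
        "RGB": rgb,
        "gray": gray,
    }


def sum_rule(probs):
    """Pick the class with the largest summed probability.

    probs: array of shape (n_models, n_classes), one row per model.
    """
    probs = np.asarray(probs, dtype=np.float64)
    return int(np.argmax(probs.sum(axis=0)))


def product_rule(probs):
    """Pick the class with the largest product of probabilities."""
    probs = np.asarray(probs, dtype=np.float64)
    # Summing logs ranks classes the same way as multiplying probabilities
    # and avoids underflow; the small constant guards against log(0).
    return int(np.argmax(np.log(probs + 1e-12).sum(axis=0)))


# Illustrative (made-up) outputs of five models over three language classes.
example_probs = np.array([
    [0.9, 0.1, 0.0],
    [0.9, 0.1, 0.0],
    [0.9, 0.1, 0.0],
    [0.9, 0.1, 0.0],
    [0.000001, 0.899999, 0.1],
])
```

With these made-up numbers the two rules disagree: the sum rule follows the four confident models, while the product rule is dominated by the single model that assigns class 0 a near-zero probability. This sensitivity to one strongly dissenting classifier is the usual practical difference between the two fusion rules.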

* Neelotpal Chakraborty [email protected]

¹ Computer Science and Engineering Department, Jadavpur University, Kolkata, India
² Computer Science and Engineering Department, Aliah University, Kolkata, India

1 Introduction

The domain of multi-lingual scene text analysis (Zhu et al. 2016) involves detecting text in natural scene images and identifying the language of the detected text for further processing. This research area is of great interest to the research community owing to potential applications in areas such as tour guidance and assistance, information retrieval, language translation, and image-to-text conversion (Lin et al. 2019). This wide scope of applications necessitates the development of robust techniques to detect text in complex natural images (Saidane and Garcia 2007). Furthermore, a culturally diverse country like India, where languages vary from region to region, poses the additional challenge of text appearing in multiple languages. People of one region may not know the languages of other regions. This