Accurate Scene Text Recognition Based on Recurrent Neural Network

Scene text recognition is a useful but very challenging task due to uncontrolled condition of text in natural scenes. This paper presents a novel approach to recognize text in scene images. In the proposed technique, a word image is first converted into a

PDF / 2,261,499 Bytes
14 Pages / 439.37 x 666.142 pts Page_size
91 Downloads / 236 Views

DOWNLOAD

REPORT

Abstract. Scene text recognition is a useful but very challenging task due to uncontrolled condition of text in natural scenes. This paper presents a novel approach to recognize text in scene images. In the proposed technique, a word image is ﬁrst converted into a sequential column vectors based on Histogram of Oriented Gradient (HOG). The Recurrent Neural Network (RNN) is then adapted to classify the sequential feature vectors into the corresponding word. Compared with most of the existing methods that follow a bottom-up approach to form words by grouping the recognized characters, our proposed method is able to recognize the whole word images without character-level segmentation and recognition. Experiments on a number of publicly available datasets show that the proposed method outperforms the state-of-the-art techniques signiﬁcantly. In addition, the recognition results on publicly available datasets provide a good benchmark for the future research in this area.

1

Introduction

Reading text in scenes is a very challenging task in Computer Vision, which has been drawing increasing research interest in recent years. This is partially due to the rapid development of wearable and mobile devices such as smart phones, digital cameras, and the latest google glass, where scene text recognition is a key module to a wide range of practical and useful applications. Traditional Optical Character Recognition (OCR) systems usually assume that the document text has well deﬁned text fonts, size, layout, etc. and scanned under well-controlled lighting. They often fail to recognize camera-captured texts in scenes, which could have little constraints in terms of text fonts, environmental lighting, image background, etc., as illustrated in Fig. 1. Intensive research eﬀorts have been observed in this area in recent years and a number of good scene text recognition systems have been proposed. One approach is to combine text segmentation with existing OCR engines, where text pixels are ﬁrst segmented from the image background and then fed to OCR engines for recognition. Several systems have been reported that exploit Markov Random Field [5], Nonlinear color enhancement [6] and Inverse Rendering [7] to extract the character regions. However, the text segmentation process by itself is a very challenging task that is prone to diﬀerent types of segmentation errors. c Springer International Publishing Switzerland 2015 D. Cremers et al. (Eds.): ACCV 2014, Part I, LNCS 9003, pp. 35–48, 2015. DOI: 10.1007/978-3-319-16865-4 3

36

B. Su and S. Lu

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

(k)

(l)

(m)

(n)

(o)

(p)

Fig. 1. Word image examples taken from the recent Public Datasets [1–4]. All the words in the images are correctly recognized by our proposed method.

Furthermore, the OCR engines may fail to recognize the segmented texts due to special text fonts and perspective distortion, as the OCR engines are usually trained on characters with fronto-parallel view and normal fonts. A number of scene text recognition techniques

Data Loading...

Accurate Scene Text Recognition Based on Recurrent Neural Network

Recommend Documents

Scene Text Recognition Based on Deep Learning

Bay Number Recognition Based on Deep Convolutional Recurrent Neural Network

Text-Based Handwritten Recognition Through an Image Using Recurrent Neural Network

EEG-based emotion recognition using 4D convolutional recurrent neural network

Sequential Deformation for Accurate Scene Text Detection

Natural Scene Mongolian Text Detection Based on Convolutional Neural Network and MSER

Emotion Recognition in Sentences - A Recurrent Neural Network Approach

Review on Text Recognition in Natural Scene Images

Air Quality Index Prediction Based on Deep Recurrent Neural Network

Urban Scene Recognition via Deep Network Integration

A Novel Isolated Speech Recognition Method Based on Neural Network

Target Recognition of RCS Sequence Based on Neural Network