Handwritten Mathematical Expression Recognition via Paired Adversarial Learning
Jin-Wen Wu (1,2) · Fei Yin (1) · Yan-Ming Zhang (1) · Xu-Yao Zhang (1,2) · Cheng-Lin Liu (1,2,3)
Received: 29 March 2019 / Accepted: 2 January 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
Recognition of handwritten mathematical expressions (MEs) is an important problem with wide practical applications. It is challenging because of the variety of writing styles and ME formats; as a result, recognizers trained by optimizing the traditional supervision loss do not perform satisfactorily. To improve the robustness of the recognizer to writing styles, we propose a novel paired adversarial learning method for learning semantic-invariant features. Specifically, our proposed model, named PAL-v2, consists of an attention-based recognizer and a discriminator. During training, handwritten MEs and their printed templates are fed into PAL-v2 simultaneously, and the attention-based recognizer is trained to learn semantic-invariant features under the guidance of the discriminator. Moreover, we adopt a convolutional decoder to alleviate the vanishing and exploding gradient problems of RNN-based decoders, and we further improve the coverage of decoding with a novel attention method. Extensive experiments on the CROHME dataset demonstrate the effectiveness of each part of the method and achieve state-of-the-art performance.
Keywords Handwritten ME recognition · Paired adversarial learning · Semantic-invariant features · Convolutional decoder · Coverage of decoding
Communicated by Jun-Yan Zhu, Hongsheng Li, Eli Shechtman, Ming-Yu Liu, Jan Kautz, Antonio Torralba.
(1) National Laboratory of Pattern Recognition, Institute of Automation of Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
(2) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
(3) CAS Center for Excellence of Brain Science and Intelligence Technology, Beijing 100190, People’s Republic of China
1 Introduction
Handwritten mathematical expression recognition (HMER) has received considerable attention for its potential applications in areas such as education, office automation, and conference systems. The problem still poses many technical challenges, because images of handwritten MEs contain far more complicated two-dimensional (2D) structures and spatial relations than the general images studied in computer vision (Aneja et al. 2018; Jaderberg et al. 2016; Krishna et al. 2017; Ordonez et al. 2016; Zhou et al. 2013). Furthermore, HMER suffers from writing-style variations (see an example in Fig. 1) and from the scarcity of annotated data. HMER has been studied since the 1960s (Anderson 1967). Traditional approaches (Chan and Yeung 2000; Zanibbi and Bl