Toward perfect neural cascading architecture for grammatical error correction
Kingsley Nketia Acheampong · Wenhong Tian

Information Intelligence Technology Laboratory, School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China

Accepted: 25 September 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
Grammatical Error Correction (GEC) is the task of correcting diverse errors in a text, such as spelling, punctuation, morphological, and word-choice mistakes. With GEC framed as a sentence correction task, neural sequence-to-sequence (seq2seq) models have emerged as solutions. However, neural seq2seq GEC models are computationally expensive both in training and in translation inference. They also tend to generalize poorly: limited error-corrected data leaves them incapable of correcting grammar effectively. In this work, we propose a Neural Cascading Architecture, together with several techniques for enhancing the effectiveness of neural seq2seq GEC models, inspired by the post-editing processes of Neural Machine Translation (NMT). Our experiments show that, in low-resource NMT models, adopting the presented cascading techniques yields performance comparable to high-resource NMT models, with improvements over the state of the art (SOTA) on the JHU FLuency-Extended GUG (JFLEG) parallel corpus for developing and evaluating GEC systems. We extensively evaluate multiple cascading learning strategies and establish best practices for improving neural seq2seq GECs.

Keywords Grammatical error correction · Neural machine translation · Machine translation · Natural language processing
1 Introduction

The application of neural machine translation (NMT) models to the task of grammatical error correction (GEC) has become popular, spurring an upsurge of research efforts toward improving NMT models, especially neural sequence-to-sequence (seq2seq) models. GEC is framed as a sentence correction task in which a typical GEC system receives an erroneous input sentence and outputs its corrected form (for example, mapping "She go to school." to "She goes to school."). Presently, GEC systems depend heavily on diverse datasets in order to produce
suitable corrections. Consequently, several datasets, both public and non-public, have been made available to support the development of GEC systems (Fig. 1). The neural network takes the form of two components: an encoder network and a decoder network. The encoder network encodes the input sentence into a sequence of distributed representations, whereas the decoder network uses these representations to generate a translation with an attention model [1, 19]. Over the years, NMT models based on this encoder-decoder design …
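As a concrete illustration of this encoder-decoder setup, the following is a minimal sketch of an attention-based seq2seq model in PyTorch. All class names, dimensions, and the dot-product attention variant are illustrative assumptions for exposition, not the architecture used in this paper.

```python
# Minimal attention-based seq2seq sketch for sentence correction.
# Illustrative only; hyperparameters and attention variant are assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        outputs, hidden = self.rnn(self.embed(src))
        return outputs, hidden                    # distributed representations + final state

class AttnDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim * 2, vocab_size)

    def forward(self, tgt, hidden, enc_outputs):  # tgt: (batch, tgt_len)
        dec_outputs, hidden = self.rnn(self.embed(tgt), hidden)
        # Dot-product attention: score each decoder state against all encoder states.
        scores = torch.bmm(dec_outputs, enc_outputs.transpose(1, 2))
        weights = torch.softmax(scores, dim=-1)    # (batch, tgt_len, src_len)
        context = torch.bmm(weights, enc_outputs)  # weighted sum of encoder states
        logits = self.out(torch.cat([dec_outputs, context], dim=-1))
        return logits, hidden

# Toy usage: integer IDs stand in for subword tokens of an erroneous sentence
# and its (teacher-forced) gold correction.
vocab = 1000
enc, dec = Encoder(vocab), AttnDecoder(vocab)
src = torch.randint(0, vocab, (2, 7))              # erroneous input sentences
tgt = torch.randint(0, vocab, (2, 9))              # shifted gold corrections
enc_out, h = enc(src)
logits, _ = dec(tgt, h, enc_out)
print(logits.shape)                                # torch.Size([2, 9, 1000])
```

At inference time the decoder would instead be run token by token, feeding back its own predictions, which is where the cost of translation inference noted in the abstract arises.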