Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension


2020 21(9):1346-1355

Frontiers of Information Technology & Electronic Engineering
www.jzus.zju.edu.cn; engineering.cae.cn; www.springerlink.com
ISSN 2095-9184 (print); ISSN 2095-9230 (online)
E-mail: [email protected]

Hybrid embedding and joint training of stacked encoder for opinion question machine reading comprehension∗

Xiang-zhou HUANG†, Si-liang TANG†, Yin ZHANG†‡, Bao-gang WEI∧

College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

† E-mail: [email protected]; [email protected]; [email protected]

Received Oct. 19, 2019; Revision accepted Mar. 16, 2020; Crosschecked Aug. 10, 2020

Abstract: Opinion question machine reading comprehension (MRC) requires a machine to answer questions by analyzing corresponding passages. Compared with traditional MRC tasks where the answer to every question is a segment of text in corresponding passages, opinion question MRC is more challenging because the answer to an opinion question may not appear in corresponding passages but needs to be deduced from multiple sentences. In this study, a novel framework based on neural networks is proposed to address such problems, in which a new hybrid embedding training method combining text features is used. Furthermore, extra attention and output layers which generate auxiliary losses are introduced to jointly train the stacked recurrent neural networks. To deal with imbalance of the dataset, irrelevancy of question and passage is used for data augmentation. Experimental results show that the proposed method achieves state-of-the-art performance. We are the biweekly champion in the opinion question MRC task in Artificial Intelligence Challenger 2018 (AIC2018).

Key words: Machine reading comprehension; Neural networks; Joint training; Data augmentation

https://doi.org/10.1631/FITEE.1900571

CLC number: TP391.1

‡ Corresponding author
∧ Deceased
∗ Project supported by the China Knowledge Centre for Engineering Sciences and Technology (No. CKCEST-2019-1-12) and the National Natural Science Foundation of China (No. 61572434)
ORCID: Xiang-zhou HUANG, https://orcid.org/0000-0001-8870-4341; Yin ZHANG, https://orcid.org/0000-0001-6986-4227
© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2020

1 Introduction

Artificial intelligence (AI) has experienced over 60 years of continuous development and changed the world (Pan, 2016). Teaching machines to read and comprehend is a vital part of AI. Machine reading comprehension (MRC) is a task of answering questions by understanding corresponding passages. It is a key goal in natural language processing (NLP). The MRC task has attracted a lot of attention in recent years and there are already several large-scale datasets released, including MCTest (Richardson et al., 2013), CNN/Daily Mail (Hermann et al., 2015), WikiQA (Yang et al., 2015), Stanford Question Answering dataset (SQuAD) (Rajpurkar et al., 2016), Microsoft MAchine Reading COmprehension dataset (MS-MARCO) (Bajaj et al., 2016), TriviaQA (Joshi et al., 2017), and DuR
