An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system


Xiao-mei Yu1,2 · Wen-zhi Feng1 · Hong Wang1,2 · Qian Chu1 · Qi Chen1

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract

Natural language processing (NLP) is one of the key techniques in intelligent question-answering (Q&A) systems. Although recurrent neural networks and long short-term memory (LSTM) networks show clear advantages on well-known English Q&A datasets, they still suffer from several difficulties in Chinese, including indeterminateness, polysemy and the lack of morphological change, which makes NLP on large and diverse Chinese Q&A datasets complex. In this paper, we first analyze the limitations of applying LSTM and bidirectional LSTM (Bi-LSTM) models to noisy Chinese Q&A datasets. We then integrate attention mechanisms and multi-granularity word segmentation into Bi-LSTM and propose an attention mechanism and multi-granularity-based Bi-LSTM model (AM–Bi-LSTM), which combines an improved attention mechanism with a novel treatment of multi-granularity word segmentation to handle the complex NLP of Chinese Q&A datasets. Furthermore, the similarity of questions and answers is formulated as a quantitative computation, which helps to achieve better performance in Chinese Q&A systems. Finally, we verify the proposed model on a noisy Chinese Q&A dataset. The experimental results demonstrate that the AM–Bi-LSTM model achieves significant improvements on evaluation metrics such as accuracy and mean average precision, and that it outperforms baseline methods and other LSTM-based models.

Keywords NLP · Artificial intelligence · Long short-term memory · Question-answering system
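The paper's model is trained in a deep-learning framework and its implementation is not reproduced here. As a rough, hedged illustration of the three ingredients the abstract names (multi-granularity word segmentation, attention-weighted pooling, and question–answer similarity), the following pure-Python sketch shows one minimal way each idea could be realized; the function names, the dot-product attention scoring, and the cosine similarity are our own simplifications, not the AM–Bi-LSTM implementation.

```python
import math

def multi_granularity_tokens(text, word_seg):
    """Combine a word-level segmentation of a Chinese sentence with its
    character-level segmentation (a simplified stand-in for the paper's
    multi-granularity word segmentation)."""
    return list(word_seg) + list(text)

def attention_pool(vectors, query):
    """Attention-weighted average of token vectors: each token vector is
    scored against a query vector by dot product, the scores are
    softmax-normalized, and the vectors are summed with those weights."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)]

def cosine_similarity(a, b):
    """Quantitative similarity between a pooled question vector and a
    pooled answer vector."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)
```

In a real system the token vectors would be the hidden states produced by the Bi-LSTM encoder rather than raw embeddings, and the attention query would itself be learned; this sketch only makes the data flow concrete.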

Communicated by B. B. Gupta.

✉ Xiao-mei Yu [email protected]
Wen-zhi Feng [email protected]
Hong Wang [email protected]
Qian Chu [email protected]
Qi Chen [email protected]

1 School of Information Science and Engineering, Shandong Normal University, Jinan 250014, People's Republic of China

2 Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan 250014, People's Republic of China

1 Introduction

The research on natural language processing (NLP) plays an important role in many fields (Almomani et al. 2018; Negi et al. 2013; Dkaich et al. 2017; Demin et al. 2019). In industry, NLP is involved in human–computer interaction, business information analysis and web software development (Zheng et al. 2016; Cheng et al. 2019). In academia, NLP is often known as "computational linguistics" and is essential in fields ranging from humanities computing and corpus linguistics to computer science and artificial intelligence (Chang et al. 2016; Li et al. 2017). As one of the core domains of artificial intelligence, research on natural language processing has progressed rapidly, and the related theories and methods have been deployed in a variety of new language applications (Bird et al. 2009;