LoG: a locally-global model for entity disambiguation
Kexuan Xin¹ · Wen Hua¹ · Yu Liu¹ · Xiaofang Zhou¹
Received: 26 December 2019 / Revised: 23 July 2020 / Accepted: 21 September 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract Entity disambiguation (ED) aims to link textual mentions in a document to the correct named entities in a knowledge base (KB). Although global ED models usually outperform local models by collectively linking mentions based on the topical coherence assumption, they may still incur incorrect entity assignment when a document contains multiple topics. Therefore, we propose a Locally-Global model (LoG) for ED which extracts global features locally, i.e., among a limited number of neighboring mentions, to combine the respective superiority of both models. In particular, we derive mention neighbors according to the syntactic distance on a dependency parse tree, and propose a tree connection method CoSimTC to measure the cross-tree distance between mentions. We also recognize the importance of keywords in a document for collective entity disambiguation, which reveal the central topic information of the document. Hence, we propose a keyword extraction method Sent2Word to detect keywords of each document. Furthermore, we extend the Graph Attention Network (GAT) to integrate both local and global features to produce a discriminative representation for each candidate entity. Our experimental results on six widely-adopted public datasets demonstrate better performance compared with state-of-the-art ED approaches. The high efficiency of the LoG model further verifies its feasibility in practice.

Keywords Entity linking · Dependency parse tree · Cross-sentence distance · Keyword extraction · Graph attention network

This article belongs to the Topical Collection: Special Issue on Web Information Systems Engineering 2019. Guest Editors: Reynold Cheng, Nikos Mamoulis, and Xin Huang

¹ School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
World Wide Web
1 Introduction

Entity disambiguation (ED), also known as entity linking (EL), is a fundamental preprocessing task in Natural Language Processing (NLP) that benefits applications such as information retrieval, question answering, and machine translation. ED aims to resolve the semantic ambiguity of a mention and link it to the correct entry in a given knowledge base (KB), as illustrated in Figure 1. ED approaches generally fall into two categories: local models and global models. Local models [5, 26, 38] resolve mention ambiguity using features such as surface form and local context, while global models (also called collective entity linking) [11, 12, 15, 18] achieve better performance by finding the assignment of all the mentions in a document that maximizes topical coherence. However, collect