Entity set expansion in knowledge graph: a heterogeneous information network perspective


Chuan SHI 1, Jiayu DING 1, Xiaohuan CAO 1, Linmei HU 1, Bin WU 1, Xiaoli LI 2

1 School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore 138632, Singapore

© Higher Education Press 2020

Abstract Entity set expansion (ESE) aims to expand an entity seed set to obtain more entities which have common properties. ESE is important for many applications such as dictionary construction and query suggestion. Traditional ESE methods relied heavily on the text and Web information of entities. Recently, some ESE methods employed knowledge graphs (KGs) to extend entities. However, they failed to effectively and efficiently utilize the rich semantics contained in a KG and ignored the text information of entities in Wikipedia. In this paper, we model a KG as a heterogeneous information network (HIN) containing multiple types of objects and relations. Fine-grained multi-type meta paths are proposed to capture the hidden relation among seed entities in a KG and thus to retrieve candidate entities. Then we rank the entities according to the meta path based structural similarity. Furthermore, to utilize the text description of entities in Wikipedia, we propose an extended model CoMeSE++ which combines both structural information revealed by a KG and text information in Wikipedia for ESE. Extensive experiments on real-world datasets demonstrate that our model achieves better performance by combining structural and textual information of entities.

Keywords entity set expansion, knowledge graph, heterogeneous information network, multi-type meta path
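As a rough illustration of the idea summarized in the abstract (retrieve candidates that share KG structure with the seeds, then rank them by structural similarity), the following is a minimal sketch and not the authors' CoMeSE++ model. It assumes a toy KG stored as (subject, predicate, object) triples and uses only the simplest path pattern, entity -predicate-> object <-predicate- entity. The triples, the function name `expand`, and the coverage-based weight are all hypothetical choices made for this example.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object); illustrative data only,
# not taken from Yago, DBpedia, or the paper's datasets.
TRIPLES = [
    ("New York", "locatedIn", "USA"),
    ("Chicago", "locatedIn", "USA"),
    ("Washington, D.C.", "locatedIn", "USA"),
    ("Houston", "locatedIn", "USA"),
    ("Los Angeles", "locatedIn", "USA"),
    ("Texas", "locatedIn", "USA"),
    ("Toronto", "locatedIn", "Canada"),
    ("New York", "type", "City"),
    ("Chicago", "type", "City"),
    ("Washington, D.C.", "type", "City"),
    ("Houston", "type", "City"),
    ("Los Angeles", "type", "City"),
    ("Toronto", "type", "City"),
    ("Texas", "type", "State"),
]


def expand(seeds, triples):
    """Rank non-seed entities by the (predicate, object) links they share with seeds.

    Assumption for this sketch: a link is only used if at least two seeds
    share it, and its weight is the fraction of seeds that have it.
    """
    seeds = set(seeds)

    # Index every (predicate, object) pair to the subjects that have it.
    subjects_by_link = defaultdict(set)
    for subj, pred, obj in triples:
        subjects_by_link[(pred, obj)].add(subj)

    scores = defaultdict(float)
    for link, subjects in subjects_by_link.items():
        covered = len(subjects & seeds)
        if covered < 2:
            continue  # link does not connect seeds to each other
        weight = covered / len(seeds)
        for candidate in subjects - seeds:
            scores[candidate] += weight

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for name, score in expand(["New York", "Chicago", "Washington, D.C."], TRIPLES):
        print(f"{name}\t{score:.1f}")
```

In this toy setting, entities that share both links with the seeds (other US cities) rank above entities that share only one (a US state, a Canadian city). This hints at why combining several relation patterns can help separate the intended class from near misses.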

1 Introduction

Entity set expansion (ESE) aims to find more entities belonging to a particular class, given only a few seed entities. For example, given seed entities such as “New York”, “Washington, D.C.” and “Chicago” (cities in America), the goal of ESE is to discover the hidden relationship between the seeds and retrieve more entities of the same class, such as “Houston” and “Los Angeles”. ESE is useful for many applications such as dictionary construction [1], word sense disambiguation [2], query refinement [3] and query suggestion [4]. Much work has been done on the ESE problem. Most of it focused on text or Web resources for inferring the inherent relationship between seed entities [5–8]. These methods discovered distributional information or context patterns of seeds from text to expand the entity set. For example, Wang and Cohen [6] proposed SEAL, which automatically finds semi-structured documents containing “lists” of items in any language to expand the entity set. These methods failed to accurately capture the hidden relation among seed entities.

Received July 3, 2019; accepted December 24, 2019

Recently, knowledge graphs (KGs) such as Yago and DBpedia have been exploited for ESE. Qi et al. [9, 10] introduced the semantic knowledge from Wikipedia to redu