A Joint Embedding Method for Entity Alignment of Knowledge Bases




Abstract. We propose a model that jointly learns the embeddings of multiple knowledge bases (KBs) in a uniform vector space in order to align entities across KBs. Instead of relying on content-similarity-based methods, we argue that the structure information of KBs is also important for KB alignment. In cross-lingual settings, or when two KBs use different encoding systems, the structure information of the two KBs is all we can leverage. We utilize seed entity alignments, whose embeddings are constrained to be identical during the joint learning process. We perform experiments on two datasets: a subset of Freebase comprising 15 thousand selected entities, and a dataset we construct from real-world large-scale KBs, Freebase and DBpedia. The results show that the proposed approach, which uses only the structure information of KBs, also works well.

Keywords: Embeddings · Multiple knowledge bases · Structure information · Freebase · DBpedia

1 Introduction

As knowledge bases (KBs) accumulate rapidly on the web, the problem of how to reuse them has gained more and more attention. In real-world scenarios, many KBs describe the same entities in different ways, because KBs are distributed, heterogeneous resources created by different individuals or organizations. For example, President Barack Hussein Obama is denoted by m.02mjmr in Freebase [3] but by Barack Obama in DBpedia [2]. Aligning such identical entities helps people acquire knowledge more conveniently, as they no longer need to look up multiple KBs to obtain the full information about an entity. However, knowledge base alignment is not a trivial task, and alignment systems are often complex [8,15]. Many traditional KB-matching pipeline systems, including [7,11,20,22], are based on content-similarity calculation and propagation. There are standard benchmark datasets from the Ontology Alignment Evaluation Initiative (OAEI) on which several alignment systems have been evaluated. These datasets do not contain many relationships, and the two KBs to be aligned share common relation and property strings, which can be used to compute content similarity to assist instance alignment. The statistics of the author-disambiguation dataset from the OAEI 2015 Instance Matching track are shown in Table 1. Consider a real case: we have an entity named m.02mjmr referring to President Barack Hussein Obama. How do we align it with the entity named Barack Obama in another KB when all of the relations and properties in the two KBs use different encoding systems? In such cross-lingual or differently encoded situations, the structure information of the two KBs is all we can leverage. Content information is important for KB alignment, but we argue that the structure information of KBs is also significant. Based on the observation above, we create two datasets: a subset of Freebase comprising 15 thousand selected entities (FB15K) and a dataset we construct from real-world large-scale KBs, Freebase and DBpedia.

© Springer Nature Singapore Pte Ltd. 2016. Y. Hao et al., in: H. Chen et al. (Eds.): CCKS 2016, CCIS 650, pp. 3–14, 2016. DOI: 10.1007/978-981-10-3168-7_1
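The core idea of joint learning with seed alignments can be sketched in a few lines. The following is a minimal, hypothetical illustration (the toy triples, identifiers, and TransE-style squared-distance objective are assumptions for illustration, not the paper's exact model): entities from two KBs that appear in the seed alignment set are mapped to a single shared embedding, so gradient updates over the union of both KBs' triples pull the two structures into one vector space.

```python
# Hypothetical sketch: joint KB embedding with shared seed-entity embeddings.
# The triples, namespaces, and training objective here are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# Toy triples (head, relation, tail) from two KBs with different encodings.
kb1 = [("fb:m.02mjmr", "fb:presidentOf", "fb:m.09c7w0")]
kb2 = [("db:Barack_Obama", "db:leaderOf", "db:United_States")]
seeds = [("fb:m.02mjmr", "db:Barack_Obama")]  # known aligned entity pair

# Seed-aligned entities are collapsed to one canonical id, so both KBs'
# occurrences update the same embedding vector ("ensured the same").
canon = {e2: e1 for e1, e2 in seeds}
cid = lambda e: canon.get(e, e)

triples = [(cid(h), r, cid(t)) for h, r, t in kb1 + kb2]
ents = sorted({e for h, _, t in triples for e in (h, t)})
rels = sorted({r for _, r, _ in triples})
eidx = {e: i for i, e in enumerate(ents)}
ridx = {r: i for i, r in enumerate(rels)}

dim, lr = 16, 0.05
E = rng.normal(scale=0.1, size=(len(ents), dim))  # entity embeddings
R = rng.normal(scale=0.1, size=(len(rels), dim))  # relation embeddings

# TransE-style objective: drive h + r - t toward zero for observed triples.
for _ in range(200):
    for h, r, t in triples:
        hi, ri, ti = eidx[h], ridx[r], eidx[t]
        diff = E[hi] + R[ri] - E[ti]
        E[hi] -= lr * diff
        R[ri] -= lr * diff
        E[ti] += lr * diff

# Both KBs now live in one vector space; the seed pair shares an embedding
# by construction, so the residual of the Freebase triple is near zero.
score = float(np.linalg.norm(E[eidx["fb:m.02mjmr"]]
                             + R[ridx["fb:presidentOf"]]
                             - E[eidx["fb:m.09c7w0"]]))
```

After training, unaligned entities from the two KBs can be compared directly by distance in the shared space, which is what makes structure-only alignment possible when no content overlap exists.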