A Chinese expert disambiguation method based on semi-supervised graph clustering

ORIGINAL ARTICLE

Jin Jiang • Xin Yan • Zhengtao Yu • Jianyi Guo • Wei Tian

Received: 26 October 2013 / Accepted: 9 April 2014 © Springer-Verlag Berlin Heidelberg 2014

Abstract To make efficient use of the associations found in expert pages, we propose a Chinese expert name disambiguation method based on semi-supervised graph clustering that integrates several kinds of associated relationships. First, the correlation features of the expert attributes are extracted through a correlation analysis of the expert pages. Second, a similarity matrix between the documents of different expert pages is constructed from the attribute features and the associated relationships of the expert pages. Finally, with the attribute correlations serving as semi-supervised constraints, an expert disambiguation model is built with a graph-based clustering approach, and the model is solved with a kernel-based method to achieve expert name disambiguation. Comparative experiments on Chinese expert disambiguation show that the semi-supervised graph clustering method integrated with the expert-associated relationships gives much better disambiguation results.

Keywords Chinese expert name disambiguation • Semi-supervised graph clustering • Associated relationship
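The three steps outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Jaccard similarity over attribute sets, the constraint weight, the diagonal shift, the farthest-first seeding, and the toy pages below are all assumptions chosen only to make the idea concrete. Pairwise constraints are folded into the similarity matrix, and the resulting kernel is clustered with plain kernel k-means (assuming the number of clusters k does not exceed the number of pages).

```python
def jaccard(a, b):
    """Attribute-set similarity (an assumed stand-in for the paper's measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_kernel(pages, must_link, cannot_link, weight=1.0, shift=2.0):
    """Similarity matrix plus constraint rewards/penalties plus a diagonal shift.

    weight and shift are arbitrary illustrative values; the shift is the usual
    device for keeping the kernel positive semidefinite.
    """
    n = len(pages)
    K = [[jaccard(pages[i], pages[j]) for j in range(n)] for i in range(n)]
    for i, j in must_link:      # pages believed to describe the same expert
        K[i][j] += weight
        K[j][i] += weight
    for i, j in cannot_link:    # pages believed to describe different experts
        K[i][j] -= weight
        K[j][i] -= weight
    for i in range(n):
        K[i][i] += shift
    return K


def farthest_first_seeds(K, k):
    """Pick k seed pages that are mutually dissimilar (hypothetical helper)."""
    seeds = [0]
    while len(seeds) < k:
        candidates = [i for i in range(len(K)) if i not in seeds]
        seeds.append(min(candidates, key=lambda i: max(K[i][s] for s in seeds)))
    return seeds


def kernel_kmeans(K, k, max_iter=50):
    """Batch kernel k-means on a precomputed kernel matrix K."""
    n = len(K)
    seeds = farthest_first_seeds(K, k)
    labels = [max(range(k), key=lambda c: K[i][seeds[c]]) for i in range(n)]
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        within = [
            sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
            for m in clusters
        ]

        def dist(i, c):
            # Squared feature-space distance from page i to the mean of cluster c.
            m = clusters[c]
            if not m:
                return float("inf")
            return K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]

        new_labels = [min(range(k), key=lambda c: dist(i, c)) for i in range(n)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy pages for one ambiguous name; every attribute value is made up.
pages = [
    {"org:A", "field:NLP", "coauthor:P"},
    {"org:A", "field:NLP", "coauthor:Q"},
    {"org:B", "field:Physics", "coauthor:R"},
    {"org:B", "field:Physics", "coauthor:S"},
]
K = build_kernel(pages, must_link=[(0, 1)], cannot_link=[(1, 2)])
labels = kernel_kmeans(K, k=2)
print(labels)  # [0, 0, 1, 1]
```

Batch kernel k-means only reaches a local optimum, and a large diagonal shift makes each page stick to its initial cluster, which is why the sketch seeds the clusters with mutually dissimilar pages rather than an arbitrary assignment.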

J. Jiang • X. Yan • Z. Yu (✉) • J. Guo • W. Tian
School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
e-mail: [email protected]

J. Jiang • X. Yan • Z. Yu • J. Guo • W. Tian
Key Lab of Intelligent Information Processing, Kunming University of Science and Technology, Kunming 650500, China

1 Introduction

Name repetition is now widespread on the Internet, which makes expert name disambiguation imperative. In current research on expert disambiguation, the main idea is still to transform the problem of expert name disambiguation into a clustering problem. The main disambiguation methods fall into the following kinds. The first is clustering disambiguation based on feature-vector similarity; for example, Wang [1] proposed using a vector space model of web content to cluster expert evidence pages, in order to solve the multi-document coreference resolution problem. The second is clustering disambiguation based on attribute similarity; for example, Cohen [2] achieved expert clustering disambiguation by calculating the similarity of attributes. The third is clustering disambiguation based on specific relationships; for example, Lang [3] presented a name disambiguation approach based on social networks, which builds the networks by exploiting the name co-occurrence relationships of evidence-page titles in the context fragment and then uses clustering to realize disambiguation.
