MINDWALC: mining interpretable, discriminative walks for classification of nodes in a knowledge graph


RESEARCH

Open Access

Gilles Vandewiele*, Bram Steenwinckel, Filip De Turck and Femke Ongenae

From The 4th International Workshop on Semantics-Powered Data Analytics SEPDA 2019 Auckland, New Zealand. 27 October 2019

Abstract

Background: Leveraging graphs for machine learning tasks can result in more expressive power, as extra information is added to the data by explicitly encoding relations between entities. Knowledge graphs are multi-relational, directed graph representations of domain knowledge. Recently, deep learning-based techniques have been gaining a lot of popularity. These techniques can either process such graphs directly or learn a low-dimensional numerical representation of them. While it has been shown empirically that these techniques achieve excellent predictive performance, they lack interpretability. Interpretability is of vital importance in applications situated in critical domains, such as health care.

Methods: We present a technique that mines interpretable walks from knowledge graphs that are very informative for a certain classification problem. The walks themselves are of a specific format that allows for the creation of data structures that result in very efficient mining. We combine this mining algorithm with three different approaches in order to classify nodes within a graph. Each of these approaches excels on a different dimension, such as explainability, predictive performance or computational runtime.

Results: We compare our techniques to well-known state-of-the-art black-box alternatives on four benchmark knowledge graph data sets. Results show that our three presented approaches, in combination with the proposed mining algorithm, are at least competitive with the black-box alternatives, often even outperforming them, while being interpretable.

Conclusions: The mining of walks is an interesting alternative for node classification in knowledge graphs. As opposed to the current state of the art, which uses deep learning techniques, it results in inherently interpretable or transparent models without a sacrifice in terms of predictive performance.

Keywords: Knowledge graphs, Data mining, Explainable AI, Decision tree, Random forest, Feature extraction

Background

Introduction

*Correspondence: [email protected]
IDLab, Ghent University – imec, Technologiepark-Zwijnaarde 126, 9000 Ghent, Belgium

Graphs are data structures that are useful to represent ubiquitous phenomena, such as social networks, chemical molecules, biological protein reactions and recommendation systems. One of their strengths lies in the fact that they explicitly model interactions between individual units (i.e. nodes) in the form of edges [1], which enriches the data. Today, graphs are increasingly being leveraged for various machine learning tasks [2]. For example, one might recommend new friends to a user in a social network [3], predict the role of a person in a collaboration network [4], or classify the role of a protein in a biological intera