Linking chemical and disease entities to ontologies by integrating PageRank with extracted relations from literature

Journal of Cheminformatics Open Access

RESEARCH ARTICLE

Pedro Ruas*, Andre Lamurias and Francisco M. Couto

Abstract

Background: Named Entity Linking systems are a powerful aid to the manual curation of digital libraries, which is becoming increasingly costly and inefficient due to information overload. Models based on the Personalized PageRank (PPR) algorithm are among the state-of-the-art approaches, but they perform poorly when the disambiguation graphs are sparse.

Findings: This work proposes a Named Entity Linking framework, named Relation Extraction for Entity Linking (REEL), that uses automatically extracted relations to overcome this limitation. Our method builds a disambiguation graph in which the nodes are the ontology candidates for the entities and the edges are added according to the relations established in the text, which the method extracts automatically. The PPR algorithm and the information content of each ontology are then applied to choose, for each entity, the candidate that maximises the coherence of the disambiguation graph. We evaluated the method on three gold standards: the subset of the CRAFT corpus with ChEBI annotations (CRAFT-ChEBI), the subset of the BC5CDR corpus with disease annotations from the MEDIC vocabulary (BC5CDR-Diseases) and the subset with chemical annotations from the CTD-Chemical vocabulary (BC5CDR-Chemicals). REEL achieved an F1-Score of 85.8%, 80.9% and 90.3% on these gold standards, respectively, outperforming baseline approaches.

Conclusions: We demonstrated that Relation Extraction (RE) tools can improve Named Entity Linking by capturing semantic information that is expressed in text but missing from knowledge bases, and by using it to enrich the disambiguation graph of Named Entity Linking models. REEL can be adapted to any text mining pipeline and potentially to any domain, as long as an ontology or other knowledge base is available.

Keywords: Named Entity Linking, Relation extraction, PageRank, Ontologies, Text mining

Introduction

Background
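The disambiguation step summarised in the abstract can be illustrated with a minimal sketch. It is not the REEL implementation: the function names, the toy candidate identifiers (`chem_1`, `dis_1`, and so on), the damping factor, and the choice to score a candidate as the PPR mass it receives from the candidates of other mentions multiplied by an information-content value are all assumptions made for illustration. The single edge stands in for a relation that a relation extraction tool might have found in the text.

```python
from collections import defaultdict


def personalized_pagerank(edges, nodes, source, damping=0.85, iterations=100):
    """Power-iteration PPR on an undirected graph, restarting at `source`."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    restart = {n: 1.0 if n == source else 0.0 for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            if neighbours[n]:
                # Spread this node's mass evenly over its neighbours.
                share = damping * rank[n] / len(neighbours[n])
                for m in neighbours[n]:
                    new_rank[m] += share
            else:
                # Dangling node: send its mass back to the restart node.
                new_rank[source] += damping * rank[n]
        rank = new_rank
    return rank


def link_entities(candidates, edges, information_content):
    """Pick one candidate per mention (hypothetical scoring, see lead-in)."""
    nodes = [c for cands in candidates.values() for c in cands]
    owner = {c: mention for mention, cands in candidates.items() for c in cands}
    ppr = {s: personalized_pagerank(edges, nodes, s) for s in nodes}

    def score(c):
        # Coherence: PPR mass reaching c from candidates of other mentions.
        coherence = sum(ppr[s][c] for s in nodes if owner[s] != owner[c])
        return coherence * information_content.get(c, 1.0)

    return {mention: max(cands, key=score) for mention, cands in candidates.items()}


# Toy data: two mentions, two candidates each, one extracted relation.
candidates = {"aspirin": ["chem_1", "chem_2"], "headache": ["dis_1", "dis_2"]}
edges = [("chem_1", "dis_1")]
information_content = {"chem_1": 0.9, "chem_2": 0.9, "dis_1": 0.8, "dis_2": 0.8}

links = link_entities(candidates, edges, information_content)
print(links)
```

In this toy graph only `chem_1` and `dis_1` are connected, so they are the only candidates that receive PPR mass from another mention. Without the extracted relation the graph would have no edges and every candidate would score zero, which is the sparsity problem the abstract refers to.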

There has been intense growth in the amount of scientific literature available, mainly in the form of scientific articles, whose content is mostly expressed in natural language. For instance, there are more than 30 million articles in the PubMed repository [1], which is one of the most used libraries in the Life Sciences and Biomedical domains. This information overload creates problems for researchers who want to retrieve information, because they need to spend more time and effort to find the articles relevant to their work. Simultaneously, the number of online resources of biological information has also been rising, as is the case with domain ontologies. Domain ontologies provide a coherent representation of the knowledge in a specific scientific field, allowing a standardised nomenclature

*Correspondence: [email protected]
LASIGE, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisbon, Portugal
