IICE: Web Tool for Automatic Identification of Chemical Entities and Interactions


1 Faculdade de Ciências, BioISI: Biosystems & Integrative Sciences Institute, Universidade de Lisboa, Lisboa, Portugal
2 LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
[email protected], [email protected], [email protected]

Abstract. Automatic methods are being developed and applied to transform textual biomedical information into machine-readable formats. Machine learning techniques have been a prominent approach to this problem. However, there is still a lack of systems that are easily accessible to users. For this reason, we developed a web tool to facilitate access to our text mining framework, IICE (Identifying Interactions between Chemical Entities). This tool annotates the input text with chemical entities and identifies the interactions described between these entities. Various options are available to control the algorithms employed by the framework and the output formats.

Keywords: Text mining · Machine learning · Ontologies · Named entity recognition · Relation extraction

1 Introduction

The amount of information about chemical compounds that is published in the form of scientific literature is growing at an unprecedented rate [1]. Updating the chemical interactions described in databases such as DrugBank [4] and IntAct [3] relies on manually reading and parsing the literature. As a result, these updates always lag behind publication, because experts must first extract the relevant information from the papers. For this reason, there is a growing need for automatic methods that transform biomedical text into machine-readable structured data, such as an interaction between compounds.

Information extraction systems applied to the biomedical domain have been developed and are available to the community [5]. However, their performance depends on the user's machine, and they usually require external libraries and specific installation steps. A more practical solution is to release the system as a web tool, with a front-end enabling any user to test and experiment with it.

We developed the IICE framework (Identifying Interactions between Chemical Entities) for automatic annotation of biomedical documents. IICE is based on supervised machine learning algorithms and semantic similarity between ontology concepts. We have evaluated the framework with the CHEMDNER dataset [7], for the recognition of chemical entities, and with the DDIExtraction dataset [8], for the extraction of drug-drug interactions. The F-measure obtained for each dataset was 78.26% and 72.52%, respectively, which can be considered nearly state-of-the-art. The IICE framework can be accessed through a web tool, with several configuration options available to the user. These options enable the user to obtain different results by adjusting the

A. Lamurias et al. © Springer International Publishing Switzerland 2015. A. Bifet et al. (Eds.): ECML PKDD 2015, Part III, LNAI 9286, pp. 285–288, 2015. DOI: 10.1007/978-3-319-23461-8_31
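For readers unfamiliar with the F-measure reported above, the following minimal sketch shows how such a score is computed from true positives, false positives and false negatives. It assumes exact-match comparison of entity spans, which may differ from the official CHEMDNER and DDIExtraction evaluation protocols. The function name `f_measure` and the example annotations are hypothetical and are not taken from the IICE framework.

```python
def f_measure(true_positives: int, false_positives: int,
              false_negatives: int, beta: float = 1.0) -> float:
    """Return the F-beta score computed from raw counts (beta=1 gives F1)."""
    if true_positives == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical gold-standard and predicted annotations, each given as
# (start offset, end offset, entity text). These are illustrative only.
gold = {(0, 7, "aspirin"), (25, 33, "warfarin"), (50, 59, "ibuprofen")}
predicted = {(0, 7, "aspirin"), (25, 33, "warfarin"), (70, 75, "water")}

# Assumption: an entity counts as correct only if its span matches exactly.
tp = len(gold & predicted)   # entities found in both sets
fp = len(predicted - gold)   # predicted entities absent from the gold standard
fn = len(gold - predicted)   # gold entities the system missed

score = f_measure(tp, fp, fn)
print(f"precision = recall = {tp / (tp + fp):.4f}, F1 = {score:.4f}")
```

With these illustrative counts, precision and recall are both 2/3, so the F1 score is also 2/3. The same function could be applied to interaction pairs instead of entity spans.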
