Modified Differential Evolution for Biochemical Name Recognizer


Abstract. In this paper we propose modified differential evolution (MDE) based feature selection and ensemble learning algorithms for a biochemical entity recognizer. Identification and classification of chemical entities are relatively more complex and challenging than other related tasks. As chemical entities we focus on IUPAC and IUPAC-related entities. The algorithm performs feature selection within the framework of a robust machine learning algorithm, namely Conditional Random Field (CRF). Features are identified and implemented mostly without using any domain-specific knowledge and/or resources. We modify traditional differential evolution to perform two tasks, viz. determining the relevant set of features and determining proper voting weights for constructing an ensemble. The feature selection technique produces a set of potential solutions in the final population. We develop many CRF models using these feature combinations. To further improve performance, the outputs of these classifiers are combined using a classifier ensemble technique based on modified DE. Our experiments with the benchmark datasets yield recall, precision and F-measure values of 82.34%, 88.26% and 85.20%, respectively.

Keywords: Modified Differential Evolution (MDE), Conditional Random Field (CRF), Feature Selection, Ensemble, Biochemical Named Entity.
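The idea of using differential evolution to choose a feature subset can be illustrated with a minimal sketch. It uses classic DE/rand/1/bin rather than the modified variant proposed here, whose details are not given in this excerpt. The function name `differential_evolution_select`, the 0.5 decoding threshold, the parameter values, and the toy fitness function are all assumptions made for illustration; in the setting of this paper the fitness would presumably be the F-measure of a CRF trained with the selected features.

```python
import random


def differential_evolution_select(fitness, n_features, pop_size=20,
                                  generations=40, f_scale=0.5, cr=0.9, seed=0):
    """Classic DE/rand/1/bin over [0, 1]^n (not the paper's modified DE).

    A gene >= 0.5 is decoded as "feature switched on" (an assumed encoding).
    """
    rng = random.Random(seed)

    def decode(vec):
        return [g >= 0.5 for g in vec]

    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(decode(v)) for v in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # forces one mutated gene
            trial = []
            for j in range(n_features):
                if rng.random() < cr or j == j_rand:
                    gene = pop[a][j] + f_scale * (pop[b][j] - pop[c][j])
                    gene = min(1.0, max(0.0, gene))  # clip to [0, 1]
                else:
                    gene = pop[i][j]
                trial.append(gene)
            trial_score = fitness(decode(trial))
            # Greedy selection: keep the trial if it is at least as good.
            if trial_score >= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = max(range(pop_size), key=lambda k: scores[k])
    return decode(pop[best]), scores[best]


# Hypothetical stand-in for "train a CRF on this feature subset and return
# its F-measure": the score is the fraction of positions matching a hidden mask.
HIDDEN_USEFUL = [True, False, True, True, False, False, True, False]


def toy_fitness(mask):
    return sum(m == u for m, u in zip(mask, HIDDEN_USEFUL)) / len(HIDDEN_USEFUL)


mask, score = differential_evolution_select(toy_fitness, len(HIDDEN_USEFUL))
print(mask, score)
```

The same loop could in principle search for ensemble voting weights by treating each gene as a classifier's weight instead of thresholding it, although whether the proposed method encodes weights this way is not stated in this excerpt.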

1 Introduction

In recent times, information extraction has drawn considerable attention from practitioners and researchers. A large amount of online information is unorganized, and a large number of documents are added to it daily, so organizing, finding and extracting relevant information from such a huge amount of data is an important challenge in our day-to-day life. In life science publications and patents, chemical compounds such as small signal molecules or other biologically active chemical substances are important entity classes. There exist many representations and nomenclatures for chemical names. Some examples are SMILES, InChI and IUPAC, of which the first two allow a direct structure search, but IUPAC-like names are more frequent in biochemical texts. Trivial chemical names can be easily found using a dictionary-based approach and can subsequently be mapped to their corresponding structures. In contrast, it is not feasible to enumerate all IUPAC-like names. Automatic identification of mentions of chemical compounds in text is of interest for a variety of reasons. It has potential applications in different text mining tasks that include, but are not limited to, the prediction of drug-drug/protein-protein interactions, finding relations to adverse reactions of chemical compounds and their associations with toxicological endpoints, and the extraction of pathway and metabolic reaction relations. It helps in semantic search by enabling the search engine to return documents containing elements of the e

A. Gelbukh (Ed.): CICLing 2014, Part I, LNCS 8403, pp. 225–236, 2014. © Springer-Verlag Berlin Heidelberg 2014

U.K. Sikdar, A. Ekbal, and S. Saha
