Machine Learning Configurations for Enhanced Human Protein Function Prediction Accuracy


Abstract Molecular class prediction of a protein is highly relevant to research in disease detection and drug discovery. Numerous approaches have been applied to increase the accuracy of the Human Protein Function (HPF) prediction task, but it remains highly challenging because of the wide and versatile nature of this domain. This research focuses on the sequence-derived attributes/features (SDF) approach to HPF prediction, which is critically analyzed with the WEKA data analysis tool. New SDFs were identified and included in the training dataset drawn from the Human Protein Reference Database, which was enlarged both in the number of sequences and in the related features used to derive relations with various protein classes. A range of machine learning approaches were analyzed for prediction effectiveness, and a comprehensive comparison was carried out to achieve higher classification accuracy. The machine learning approach was also analyzed for its limitations when applied to a broad-spectrum data domain, and remedies for these limitations were explored by changing the configuration of data sets and prediction classes.

Keywords Bagging · Bayes Net · C5 · Decision tree · HPF · IBK · J48 · Logistic approach · PART · Random forest · SDF · Weka
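To make the idea of sequence-derived features concrete, the following is a minimal Python sketch of how a few such features might be computed from a raw amino acid sequence. It is an illustration only and not the feature set used in this work: the choice of features (length, amino acid composition, hydrophobic fraction), the hydrophobic residue grouping, and the function names `sequence_derived_features` and `to_feature_vector` are all assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one common grouping of hydrophobic residues; other
# groupings exist and the source does not specify one.
HYDROPHOBIC = set("AVILMFWC")


def sequence_derived_features(sequence):
    """Return a dict of simple features computed from the sequence alone."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError("unsupported residue codes: %s" % sorted(unknown))

    counts = Counter(seq)
    n = len(seq)

    features = {"length": float(n)}
    # Amino acid composition: fraction of each residue type.
    for aa in AMINO_ACIDS:
        features["frac_" + aa] = counts[aa] / n
    # Fraction of residues in the (assumed) hydrophobic group.
    features["frac_hydrophobic"] = sum(counts[aa] for aa in HYDROPHOBIC) / n
    return features


def to_feature_vector(features):
    """Flatten the feature dict into a list with a stable column order."""
    return [features[name] for name in sorted(features)]


if __name__ == "__main__":
    feats = sequence_derived_features("MKVLA")  # toy sequence, not real data
    print(feats["length"], feats["frac_hydrophobic"])
```

Each protein then becomes one fixed-length row of numbers plus a class label, which is the tabular form that a tool such as WEKA accepts (for example as an ARFF or CSV file) for training and comparing classifiers.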

1 Introduction

Protein classification is a vast domain with an enormous amount of data available for research and analysis, yet knowledge about how to interpret it correctly is very limited. Machine learning (ML), on the other hand, provides promising answers in research areas that are not clearly defined. It is thus a powerful tool for exploring ways to enhance the current understanding of proteins.

A. Singh, S. Sharma (&), G. Singh, R. Singh
Guru Nanak Dev University, Amritsar, India
e-mail: [email protected]
A. Singh
e-mail: [email protected]
© Springer Nature Singapore Pte Ltd. 2019
A. K. Luhach et al. (eds.), Smart Computational Strategies: Theoretical and Practical Aspects, https://doi.org/10.1007/978-981-13-6295-8_4


Decision tree [1, 2] based prediction is a clear and reliable machine learning approach for protein classification. Being a white-box approach, it explicitly illustrates the sequence of computations involved at every stage. This advantage enables its use by computational experts even without much knowledge of the concerned domain. Similarly, a domain expert is empowered to examine the route followed by a computation expert. The gap between technical knowledge and domain expertise is thereby narrowed. Nodes and edges indicate various utilities at the different stages of computation in a decision tree [3]. A decision tree neatly depicts the required results, or the outputs of the various possible outcomes. It clearly defines the problem structure and its interpretation in a hierarchical way, which is much easier to comprehend. The model also has a unique ability to consider different initial parameters and still reach a goal [4, 5]. However, recent advancements suggest that the prediction of protein function is a domain
