A consensus multi-view multi-objective gene selection approach for improved sample classification


METHODOLOGY

Open Access

Sudipta Acharya1, Laizhong Cui1* and Yi Pan2

From The 18th Asia Pacific Bioinformatics Conference, Seoul, Korea. 18-20 August 2020

*Correspondence: [email protected]
1 Big Data Institute, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, PR China
Full list of author information is available at the end of the article

Abstract

Background: In computational biology, analyzing complex data helps to extract relevant biological information. Sample classification of gene expression data is one popular technique for such analysis. However, the large number of irrelevant or redundant genes in expression data makes sample classification algorithms work inefficiently. Feature selection is a dimensionality reduction technique that helps to maximize the effectiveness of a sample classification algorithm. Recent advances in biotechnology have made biological data multi-modal, that is, available in multiple views. Different ‘omics’ resources capture different, equally important biological properties of entities. However, most existing feature selection methods consider only one of these biological resources. Consequently, they may ignore crucial aspects of the available biological knowledge that could further improve feature selection efficiency.

Results: In this work, we propose a Consensus Multi-View Multi-objective Clustering-based feature selection algorithm called CMVMC. Three controlled genomic and proteomic resources, namely gene expression, Gene Ontology (GO), and the protein-protein interaction network (PPIN), are utilized to build two independent views. Multi-objective consensus clustering is applied within the proposed gene selection method to satisfy both views. Two gene expression data sets, Multiple tissues and Yeast, from two different organisms (Homo sapiens and Saccharomyces cerevisiae, respectively) are chosen for the experiments. As the end product of CMVMC, a reduced set of relevant and non-redundant genes is found for each chosen data set. These genes are then used for effective sample classification.
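To make the general idea of clustering-based gene selection concrete, the following minimal sketch groups genes by the similarity of their expression profiles and keeps one representative per group, so that redundant genes collapse into a single selected gene. It is not the CMVMC algorithm: it uses a single view (expression only), a plain k-means with a single objective, and no consensus step. The function names, the deterministic initialisation, and the toy data are all assumptions made for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_profile(profiles):
    """Component-wise mean of a non-empty list of profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def select_genes(expression, k, iterations=20):
    """Cluster genes with plain k-means; return one representative per cluster.

    expression: dict mapping gene name -> list of expression values,
                one value per sample (hypothetical input layout).
    """
    genes = sorted(expression)
    # Assumption: deterministic start from the first k genes in sorted order.
    centres = [expression[g] for g in genes[:k]]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for g in genes:
            nearest = min(range(k),
                          key=lambda c: distance(expression[g], centres[c]))
            clusters[nearest].append(g)
        # Update step: recompute centres; an empty cluster keeps its old centre.
        centres = [
            mean_profile([expression[g] for g in members]) if members else centres[c]
            for c, members in enumerate(clusters)
        ]
    # Representative = the gene closest to its cluster centre.
    return [
        min(members, key=lambda g: distance(expression[g], centres[c]))
        for c, members in enumerate(clusters)
        if members
    ]


# Toy data (invented): two low-expression genes and three high-expression genes.
toy = {
    "g1": [1.0, 1.1, 0.9, 1.0],
    "g2": [1.1, 1.0, 1.0, 0.9],
    "g3": [5.0, 5.2, 4.9, 5.1],
    "g4": [5.1, 5.0, 5.0, 5.2],
    "g5": [5.0, 5.1, 5.0, 5.1],
}
selected = select_genes(toy, k=2)
print(selected)  # one gene from each of the two groups
```

In a multi-view setting such as the one described above, the distance function would presumably be replaced by view-specific similarity measures, and the partitions from the views would be reconciled by consensus; those details are not specified here and are not modelled in this sketch.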

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the

