A multi-objective based PSO approach for inferring pathway activity utilizing protein interactions
Pratik Dutta¹ · Sriparna Saha¹ · Sukanya Naskar²
Received: 26 November 2019 / Revised: 1 May 2020 / Accepted: 24 June 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
The pathway information for a given microarray gene expression dataset can be collected from publicly available databases. Inferring the activity of a pathway is a crucial task in functional genomics. In general, the genes associated with a given pathway are weighted equally when measuring goodness, but the contribution of each gene should be quantified differently. In the current study, we quantify the degrees of relevance of the different genes participating in a pathway by optimizing different goodness measures of pathway activity. Two popular goodness measures, namely the t-score and the z-score, are modified to measure the goodness of weighted gene vectors. In addition, a goodness measure based on the protein-protein interaction scores of pairs of genes participating in a pathway is used as a further objective function. All these measures are designed to handle the weighted importance of individual genes. The search capability of multi-objective particle swarm optimization (PSO) is used to find appropriate relevance vectors for the different genes. The proposed approach is applied to five real-life gene expression datasets, and its performance is compared with eight existing feature selection methods. The comparative results demonstrate the superiority of the proposed PSO-based technique. The performance of the proposed method is validated using a statistical significance test, and a biological significance test is carried out to justify the biological relevance of the extracted pathway-based gene markers.

Keywords Pathway activity · Particle swarm optimization · Gene markers · Protein-protein interaction · Weighted t-score
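To make the idea of a weighted goodness measure concrete, the following is a minimal sketch, not the formulation used in this paper. It assumes that pathway activity for a sample is the weighted average of the expression values of the pathway's genes, and it uses Welch's two-sample t-statistic as a stand-in for the modified t-score, whose exact form is not given in this excerpt. The function names, the toy data, and the two weight vectors are all hypothetical.

```python
import math
from statistics import mean, variance


def pathway_activity(expression, weights):
    """Weighted average of one sample's expression values (assumed form)."""
    return sum(w * e for w, e in zip(weights, expression)) / sum(weights)


def weighted_t_score(samples, labels, weights):
    """Welch t-statistic of the weighted activity between classes 1 and 0."""
    activity = [pathway_activity(row, weights) for row in samples]
    pos = [a for a, y in zip(activity, labels) if y == 1]
    neg = [a for a, y in zip(activity, labels) if y == 0]
    std_err = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    return (mean(pos) - mean(neg)) / std_err


# Toy data: rows are samples, columns are the genes of one pathway.
samples = [
    [2.0, 1.0, 5.0],
    [4.0, 1.0, 3.0],
    [8.0, 1.0, 4.0],
    [10.0, 1.0, 6.0],
]
labels = [0, 0, 1, 1]

# Equal weights versus a relevance vector that favours the first gene.
equal = weighted_t_score(samples, labels, [1.0, 1.0, 1.0])
focused = weighted_t_score(samples, labels, [1.0, 0.0, 0.0])
print(equal, focused)
```

In this toy example the first gene separates the two classes most clearly, so the relevance vector that concentrates on it gives a larger score than equal weighting. A search procedure such as PSO would explore such weight vectors rather than fixing them by hand.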
Multimedia Tools and Applications

1 Introduction
Microarray technology is widely used to measure the expression levels of thousands of genes. Genes can be grouped into clusters based on their expression profiles over different conditions or samples. A microarray dataset can be viewed as a two-dimensional matrix $E = [e_{ij}]_{m \times n}$, where $m$ is the number of samples or time points and $n$ is the number of genes. Each element $e_{ij}$ represents the expression level of the $j$th gene for the $i$th sample or time point. Grouping genes by expression profile helps in inferring their possible activities. Genes that are expressed distinctively over different samples can have a remarkable influence in discovering disease states or in drug target prediction for medical treatments [48]. These genes are also termed gene markers. The automatic selection of gene markers is a diffic