cPEA: a parallel method to perform pathway enrichment analysis using multiple pathways databases


METHODOLOGIES AND APPLICATION

Giuseppe Agapito1 · Mario Cannataro2

© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Genes and proteins activate or inhibit biological pathways both inside and outside the cells of every living organism. Understanding their functional roles requires deducing the relationships between pathways and genes/proteins. Pathway enrichment analysis (PEA), an essential method in omics research, serves this purpose by identifying the biological role of genes/proteins in a given context. Many PEA methods and tools are available; nevertheless, only a few can perform PEA using information from multiple databases in the same analysis. Many of these databases were originally developed around their own pathway representation format, resulting in a heterogeneous collection of resources that are extremely difficult to combine and use. Soft computing enables approximate solutions to problems that are hard to solve exactly, such as merging and integrating structured and unstructured data, or data from different databases. Integrating and merging biological pathways from diverse data sources is challenging because of the different pathway data representations used. Parallel preprocessing methods that deal with approximation and imprecision can help integrate heterogeneous pathway data. We implemented an automatic methodology to perform PEA using pathways from different databases, together with a method to compute topological scores to rank enriched pathways. This methodology is available in a software framework called cross-pathway enrichment analysis (cPEA). The obtained results show good performance in terms of execution time and reduced memory consumption, making it possible to improve PEA by using pathways from different databases.

Keywords Parallel computing · Statistical analysis · Pathway enrichment analysis · Gene expression · SNP
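To make the idea of running PEA over several databases concrete, the following minimal Python sketch scores every pathway of every database against one query gene list and merges the results into a single ranking. It is an illustration under stated assumptions, not the cPEA implementation: the abstract does not say which statistical test cPEA uses, so a hypergeometric over-representation test (a common choice for PEA) is assumed here; the function names, the database names `DB_A` and `DB_B`, the pathway gene sets, and the universe size are all hypothetical; and the topological scoring mentioned in the abstract is not covered.

```python
from concurrent.futures import ThreadPoolExecutor
from math import comb


def hypergeom_pvalue(overlap, pathway_size, query_size, universe_size):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of pathway genes found when drawing query_size genes
    from a universe of universe_size genes, pathway_size of which belong
    to the pathway. Assumes pathway_size and query_size <= universe_size.
    """
    upper = min(pathway_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich_database(db_name, pathways, query, universe_size):
    """Score every pathway of one database against the query gene set."""
    rows = []
    for pathway_name, genes in pathways.items():
        overlap = len(query & genes)
        p = hypergeom_pvalue(overlap, len(genes), len(query), universe_size)
        rows.append((db_name, pathway_name, overlap, p))
    return rows


def cross_database_enrichment(databases, query, universe_size):
    """Process each database as an independent task, then merge and rank by p-value."""
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(enrich_database, name, pathways, query, universe_size)
            for name, pathways in databases.items()
        ]
        rows = [row for future in futures for row in future.result()]
    return sorted(rows, key=lambda row: row[3])


if __name__ == "__main__":
    # Hypothetical toy data: two databases already mapped to a shared gene-symbol format.
    databases = {
        "DB_A": {
            "apoptosis": {"TP53", "BAX", "CASP3", "BCL2"},
            "glycolysis": {"HK1", "PFKM", "PKM"},
        },
        "DB_B": {"p53 signaling": {"TP53", "MDM2", "CDKN1A", "BAX"}},
    }
    query = {"TP53", "BAX", "CASP3"}
    for row in cross_database_enrichment(databases, query, universe_size=100):
        print(row)
```

The sketch assumes the hard part described in the abstract, converting each database's own pathway format into a common representation, has already been done, so that every pathway is a plain set of gene symbols. Threads are used only to show how the work partitions by database; for CPU-bound scoring in Python, a process pool would be the more realistic choice.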

Communicated by V. Loia.

✉ Giuseppe Agapito [email protected]
Mario Cannataro [email protected] http://dsmc.unicz.it/personale/docente/mariocannataro

1 Department of Legal, Economic and Social Sciences, and Data Analytics Research Center, University “Magna Græcia” of Catanzaro, Catanzaro, Italy
2 Department of Medical and Surgical Sciences, and Data Analytics Research Center, University “Magna Græcia” of Catanzaro, Catanzaro, Italy

1 Introduction

After the sequencing of the whole DNA (Collins et al. 2003), which took place a few decades ago, it appears that only a small part of the DNA, about 5%, is coding, while the remaining portion, about 95%, has no agreed meaning (i.e. a role in the various biological processes). The sequencing of a complete genome was also achieved thanks to the development of high-throughput (HT) methodologies. HT assays such as microarrays and next-generation sequencing (NGS) produce vast amounts of data
