MCC-SP: a powerful integration method for identification of causal pathways from genetic variants to complex disease
RESEARCH ARTICLE
Open Access
Yuchen Zhu1†, Jiadong Ji2†, Weiqiang Lin1, Mingzhuo Li1, Lu Liu1, Huanhuan Zhu3,4, Fuzhong Xue1, Xiujun Li1, Xiang Zhou3,4 and Zhongshang Yuan1*
Abstract
Background: Genome-wide association studies (GWAS) have successfully identified genetic susceptibility variants for complex diseases. However, the mechanisms underlying these associations remain largely unknown. Most disease-associated genetic variants have been shown to reside in noncoding regions, leading to the hypothesis that regulation of gene expression may be the primary biological mechanism. Current methods for characterizing how gene expression mediates the effect of a genetic variant on disease often analyze one gene at a time and ignore the network structure. The impact of a genetic variant can propagate to other genes along the links in the network, and then to the disease. There may be multiple pathways from the genetic variant to the disease, each with a chain structure whose first node is a specific single nucleotide polymorphism (SNP) and whose last node is the disease outcome. A key but inadequately addressed question is how to measure the between-node connection strength and rank the effects of such chain-type pathways. An answer can provide statistical evidence for prioritizing pathways for potential drug development in a cost-effective manner.
Results: We first introduce the maximal correlation coefficient (MCC) to represent the between-node connection, and then integrate MCC with the K shortest paths algorithm to rank and identify potential pathways from genetic variant to disease. A pathway importance score (PIS) is further provided to quantify the importance of each pathway. We term this method "MCC-SP". Various simulations illustrate that MCC is a better measure of the between-node connection strength than other quantities, including Pearson correlation, Spearman correlation, distance correlation, mutual information, and the maximal information coefficient. Finally, we applied MCC-SP to a real dataset from the Religious Orders Study and the Memory and Aging Project, and detected 2 typical pathways from APOE genotype to Alzheimer's disease (AD) through gene expression enriched in the Alzheimer's disease pathway.
Conclusions: MCC-SP has powerful and robust performance in identifying the pathway(s) from the genetic variant to the disease. The source code of MCC-SP is freely available at GitHub (https://github.com/zhuyuchen95/ADnet).
Keywords: Maximum correlation coefficient, K shortest paths algorithms, Integration method, Pathway, Alzheimer's disease
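The idea of ranking chain-type pathways by between-node connection strength can be illustrated with a small sketch. It is not the authors' implementation (see the GitHub repository for that) and it makes several assumptions that the abstract does not state: the edge strengths are made-up numbers standing in for estimated MCC values, the toy network is directed, each strength s is turned into a cost of -log(s) so that the lowest-cost path is the one with the largest product of strengths, and the K best paths are found by brute-force enumeration of loop-free paths, not by a dedicated K shortest paths algorithm such as Yen's. The node names and the functions `edge_cost`, `simple_paths` and `k_shortest_paths` are hypothetical, and the path score shown is not the paper's PIS.

```python
import math
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy network: node -> {successor: connection strength in (0, 1]}.
# The strengths are invented stand-ins for estimated MCC values.
TOY_NETWORK: Dict[str, Dict[str, float]] = {
    "SNP": {"G1": 0.8, "G2": 0.5},
    "G1": {"G2": 0.9, "G3": 0.4, "AD": 0.6},
    "G2": {"AD": 0.7},
    "G3": {"AD": 0.9},
}


def edge_cost(strength: float) -> float:
    """Assumed transform: a strong link is a short edge.

    Using -log makes costs additive, so minimising the summed cost
    maximises the product of strengths along the path.
    """
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must lie in (0, 1]")
    return -math.log(strength)


def simple_paths(graph: Dict[str, Dict[str, float]],
                 source: str, target: str) -> Iterator[List[str]]:
    """Yield every loop-free path from source to target (depth-first)."""
    stack: List[Tuple[str, List[str]]] = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        for successor in graph.get(node, {}):
            if successor not in path:  # keep the path loop-free
                stack.append((successor, path + [successor]))


def k_shortest_paths(graph: Dict[str, Dict[str, float]],
                     source: str, target: str,
                     k: int) -> List[Tuple[float, List[str]]]:
    """Return the k lowest-cost loop-free paths as (cost, path) pairs.

    Brute-force enumeration: fine for a toy graph, too slow for a
    genome-scale network.
    """
    scored = []
    for path in simple_paths(graph, source, target):
        cost = sum(edge_cost(graph[a][b]) for a, b in zip(path, path[1:]))
        scored.append((cost, path))
    scored.sort(key=lambda item: item[0])
    return scored[:k]


if __name__ == "__main__":
    for cost, path in k_shortest_paths(TOY_NETWORK, "SNP", "AD", k=2):
        # exp(-cost) recovers the product of strengths along the path.
        print(" -> ".join(path), f"product of strengths = {math.exp(-cost):.3f}")
```

In this toy network the three-step chain through G1 and G2 ranks above the direct chain through G1 alone, because the product of its strengths is larger. A ranking by path cost does not necessarily favor the path with the fewest nodes.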
* Correspondence: [email protected]
† Yuchen Zhu and Jiadong Ji contributed equally to this work.
1 Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan 250012, Shandong, China
Full list of author information is available at t