Two-phase SSU and SKAT in genetic association studies
© Indian Academy of Sciences
RESEARCH ARTICLE
YUAN XUE1,2, JUAN DING3, JINJUAN WANG1,4, SANGUO ZHANG1,2 and DONGDONG PAN5*
1School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
2Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100049, People’s Republic of China
3School of Mathematics and Statistics, Guangxi Normal University, Guilin 541004, People’s Republic of China
4LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People’s Republic of China
5Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming 650500, People’s Republic of China
*For correspondence. E-mail: [email protected].
Received 6 November 2018; revised 26 July 2019; accepted 23 October 2019
Abstract. The sum of squared score (SSU) test and the sequence kernel association test (SKAT) are two good alternative tests for genetic association studies with case–control data. Both SSU and SKAT are derived by assuming a dose-response model between the risk of disease and genotypes. In practice, however, the true genetic mode of inheritance is unknown, so these two tests might lose power substantially when the genetic model is misspecified, as shown in our simulation results. Here, to make both tests suitable in a broad range of situations, we propose two-phase SSU (tpSSU) and two-phase SKAT (tpSKAT), where the Hardy–Weinberg equilibrium test is adopted to choose the genetic model in the first phase, and the SSU and SKAT are constructed according to the selected genetic model in the second phase. We found that both tpSSU and tpSKAT outperformed the original SSU and SKAT in most of our simulation scenarios. By applying tpSSU and tpSKAT to type 2 diabetes data, we successfully identified some genes that have direct effects on obesity. We also detected the significant chromosomal region 10q21.22 in the GAW16 rheumatoid arthritis dataset, with P < 10⁻⁶. These findings suggest that tpSSU and tpSKAT can be effective in identifying genetic variants for complex diseases in case–control association studies.
Keywords. multiple-markers analysis; genetic model; Hardy–Weinberg equilibrium; power.
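The two-phase idea summarized in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration, not a detail taken from this excerpt: the phase-one rule is a Hardy–Weinberg disequilibrium contrast between cases and controls with a threshold `c = 1.645`, the model is chosen separately for each SNP, the phase-two statistic is the usual SSU form U'U with U = X'(y - mean(y)), and a permutation p-value replaces any asymptotic approximation. The function names (`recode`, `select_model`, `two_phase_stat`, `two_phase_ssu_test`) are hypothetical.

```python
import numpy as np

def recode(g, model):
    """Recode counts (0/1/2) of the coded allele under a genetic model."""
    g = np.asarray(g)
    if model == "additive":
        return g.astype(float)
    if model == "dominant":
        return (g >= 1).astype(float)
    if model == "recessive":
        return (g == 2).astype(float)
    raise ValueError(f"unknown model: {model}")

def hwd_coef(g):
    """Hardy-Weinberg disequilibrium coefficient: P(AA) - freq(A)^2."""
    g = np.asarray(g)
    return np.mean(g == 2) - (np.mean(g) / 2.0) ** 2

def select_model(g, y, c=1.645):
    """Phase one (assumed rule): contrast the HWD coefficient of cases and
    controls; a large positive value suggests recessive, a large negative
    value suggests dominant, anything else defaults to additive."""
    g, y = np.asarray(g), np.asarray(y)
    r, s = int(np.sum(y == 1)), int(np.sum(y == 0))
    p = np.mean(g) / 2.0                      # pooled coded-allele frequency
    if r == 0 or s == 0 or p in (0.0, 1.0):   # nothing to contrast
        return "additive"
    delta = hwd_coef(g[y == 1]) - hwd_coef(g[y == 0])
    z = np.sqrt(r * s / (r + s)) * delta / (p * (1.0 - p))
    if z > c:
        return "recessive"
    if z < -c:
        return "dominant"
    return "additive"

def two_phase_stat(G, y):
    """Phase two: SSU statistic on genotypes recoded by the selected models."""
    models = [select_model(G[:, j], y) for j in range(G.shape[1])]
    X = np.column_stack([recode(G[:, j], m) for j, m in enumerate(models)])
    u = X.T @ (y - y.mean())                  # score vector under the null
    return float(u @ u), models

def two_phase_ssu_test(G, y, n_perm=999, seed=0):
    """Permutation p-value; both phases are redone for every permutation."""
    G = np.asarray(G)
    y = np.asarray(y, dtype=float)
    observed, models = two_phase_stat(G, y)
    rng = np.random.default_rng(seed)
    hits = sum(two_phase_stat(G, rng.permutation(y))[0] >= observed
               for _ in range(n_perm))
    return models, observed, (hits + 1) / (n_perm + 1)
```

Calling `two_phase_ssu_test(G, y)` on an n-by-m genotype matrix `G` and a 0/1 case indicator `y` returns the selected model per SNP, the statistic, and the permutation p-value. Repeating the model selection inside each permutation keeps the null distribution consistent with the data-driven choice made in phase one.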
Introduction
In the past decade, large-scale genetic studies, especially genome-wide association studies (GWAS), have led to the discovery of genetic variants for human complex diseases and traits. A standard case–control GWAS inevitably genotypes a large number of single-nucleotide polymorphisms (SNPs). A simple and popular approach is to analyse each SNP individually and then apply a stringent significance level to select the deleterious SNPs. Although single-SNP analysis has proven valid in detecting many disease-susceptibility variants, it can be inefficient because the multiple-testing adjustment used to control the false-positive rate is conservative and individual genetic effects are weak. Often with the