Modeling the Dependence Structure in Genome Wide Association Studies of Binary Phenotypes in Family Data

ORIGINAL RESEARCH

Souvik Seal1 · Jeffrey A. Boatman1 · Matt McGue2 · Saonli Basu1

Received: 10 November 2019 / Accepted: 27 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

Genome-wide association studies (GWASs) are a popular tool for detecting associations between genetic variants, or single nucleotide polymorphisms (SNPs), and complex traits. Family data introduce complexity because family members are not independent. Methods for non-independent data are well established, but when the GWAS contains distinct family types, explicitly modeling between-family-type differences in the dependence structure comes at the cost of a significantly increased computational burden. The situation is exacerbated with binary traits. In this paper, we perform several simulation studies to compare candidate methods for single-SNP association analysis with binary traits. We consider generalized estimating equation (GEE), generalized linear mixed model (GLMM), and generalized least squares (GLS) approaches. We study how the choice of working correlation structure for GEE influences the GWAS findings, and we compare the performance of the different analysis methods for conducting a GWAS with binary trait data in families. We discuss the merits of each approach with attention to its applicability in a GWAS. We also compare the performance of the methods on the alcoholism data from the Minnesota Center for Twin and Family Research (MCTFR) study.

Keywords Family data · Population-based association analysis · Genome-wide scan · Generalized estimating equation · Generalized linear mixed effect model · Generalized least squares

Edited by Stacey Cherny.

* Souvik Seal

1 Division of Biostatistics, University of Minnesota, Minneapolis, MN, USA
2 Department of Psychology, University of Minnesota, Minneapolis, MN, USA

Introduction

Genome-wide association studies (GWASs) seek to detect associations between genetic variants and observed disease phenotypes. Many such GWASs involve analyzing family data (Duerr et al. 2006; Graham et al. 2008; Benyamin et al. 2009), but explicit modeling of the dependencies among the family members introduces additional complexities. The observations within a family are correlated due to both shared environment and shared genes, which complicates the statistical modeling. Methods for analyzing quantitative traits with family data have been well developed; methods for conducting a genome-wide association study with binary traits have received less attention. One class of methods for analyzing binary family data, generalized linear mixed models (GLMMs), introduces random effects into the statistical models to account for the within-family dependencies. Other methods, such as generalized estimating equations (GEE) (Liang and Zeger 1986), estimate the marginal, population-averaged effect of a genetic variant on the phenotype. Another clas
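As a rough illustration of the distinction drawn above between GLMMs and GEE, the two model classes can be written out for a single SNP. This is a generic textbook sketch, not the authors' specification: the notation is assumed for illustration, with $Y_{ij}$ the binary trait of member $j$ in family $i$, $g_{ij}$ the SNP genotype, $x_{ij}$ a covariate vector, $b_i$ a vector of family-level random effects, and $R_i(\rho)$ a working correlation matrix. The paper's own symbols and covariance choices may differ.

```latex
% GLMM (subject-specific): random effects b_i induce within-family dependence.
\operatorname{logit} \Pr(Y_{ij} = 1 \mid b_i)
  = x_{ij}^{\top}\alpha + g_{ij}\,\beta + b_{ij},
\qquad b_i \sim N(0, \Sigma_i).

% GEE (population-averaged): only the marginal mean is modeled,
\mu_{ij} = \Pr(Y_{ij} = 1), \qquad
\operatorname{logit} \mu_{ij} = x_{ij}^{\top}\alpha^{*} + g_{ij}\,\beta^{*},

% and \theta = (\alpha^{*}, \beta^{*}) solves the estimating equations
\sum_{i} D_i^{\top} V_i^{-1} \bigl( Y_i - \mu_i \bigr) = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \theta^{\top}},
\qquad V_i = A_i^{1/2}\, R_i(\rho)\, A_i^{1/2},

% with A_i the diagonal matrix of binomial variances.
A_i = \operatorname{diag}\bigl\{ \mu_{ij}\,(1 - \mu_{ij}) \bigr\}.
```

In this sketch $\beta$ is a subject-specific log odds ratio, conditional on the random effects, while $\beta^{*}$ is the population-averaged effect. The working correlation $R_i(\rho)$ (for example independence or exchangeable) affects the efficiency of the GEE estimate; with a robust sandwich variance the estimate remains consistent even when $R_i(\rho)$ is misspecified, as long as the mean model is correct.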
