Robust Methods for Expression Quantitative Trait Loci Mapping

Abstract

As a promising tool for dissecting the genetic basis of common diseases, expression quantitative trait loci (eQTL) studies have attracted increasing research interest. Traditional eQTL methods focus on testing the associations between individual single-nucleotide polymorphisms (SNPs) and gene expression traits. A major drawback of this approach is that it cannot model the joint effect of a set of SNPs on a set of genes, which may correspond to biological pathways. In this chapter, we study the problem of identifying group-wise associations in eQTL mapping. Based on the intuition of group-wise association, we examine how integrating heterogeneous prior knowledge on the correlation structures between SNPs and between genes can improve the robustness and interpretability of eQTL mapping.

Keywords: Robust methods • eQTL • Gene expression • Parameter analysis • Biostatistics

W. Cheng, NEC Laboratories America, Inc., Princeton, NJ, USA; e-mail: [email protected]; [email protected]
X. Zhang, Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA; e-mail: [email protected]
W. Wang, Department of Computer Science, University of California, Los Angeles, CA, USA; e-mail: [email protected]
© Springer International Publishing Switzerland 2016. K.-C. Wong (ed.), Big Data Analytics in Genomics, DOI 10.1007/978-3-319-41279-5_2

1 Introduction

The most abundant sources of genetic variation in modern organisms are single-nucleotide polymorphisms (SNPs). An SNP is a DNA sequence variation occurring when a single nucleotide (A, T, G, or C) in the genome differs between individuals of a species. For inbred diploid organisms, such as inbred mice, an SNP usually shows variation between only two of the four possible nucleotide types [26], which allows us to represent it by a binary variable. The binary representation of an SNP is also referred to as the genotype of the SNP. More broadly, the genotype of an organism is the genetic code in its cells. This genetic constitution of an individual influences, but is not solely responsible for, many of its traits. A phenotype is an observable trait or characteristic of an individual, that is, a visible or expressed trait such as hair color. The phenotype depends on the genotype but can also be influenced by environmental factors. Phenotypes can be either quantitative or binary.

Driven by the advancement of cost-effective and high-throughput genotyping technologies, genome-wide association studies (GWAS) have revolutionized the field of genetics by providing new ways to identify genetic factors that influence phenotypic traits. Typically, GWAS focus on associations between SNPs and traits such as major diseases. As an important subsequent analysis, quantitative trait locus (QTL) analysis aims to detect the associations between two types of information, quantitative phenotypic data (trait measurements) and genotypic data (usually SNPs), in an attempt to explain the genetic basis of var