A vast resource of allelic expression data spanning human tissues



SHORT REPORT

Open Access

Stephane E. Castel1,2*, François Aguet3, Pejman Mohammadi1,2,4,5, GTEx Consortium, Kristin G. Ardlie3 and Tuuli Lappalainen1,2*

* Correspondence: scastel@nygenome.org; tlappalainen@nygenome.org
1 New York Genome Center, New York, NY, USA
Full list of author information is available at the end of the article

Abstract

Allelic expression (AE) analysis robustly measures cis-regulatory effects. Here, we present and demonstrate the utility of a vast AE resource generated from the GTEx v8 release, containing 15,253 samples spanning 54 human tissues for a total of 431 million measurements of AE at the SNP level and 153 million measurements at the haplotype level. In addition, we develop an extension of our tool phASER that allows effect sizes of cis-regulatory variants to be estimated using haplotype-level AE data. This AE resource is the largest to date, and we are able to make haplotype-level data publicly available. We anticipate that the availability of this resource will enable future studies of regulatory variation across human tissues.

Keywords: ASE, Allelic expression, eQTL, Regulatory variation, Genomics, Functional genomics, GTEx

Background

Allelic expression (AE, also known as allele-specific expression or ASE) analysis is a powerful technique that can be used to measure the expression of gene alleles relative to one another within single individuals. This makes it well suited to measure cis-acting regulatory variation using imbalance between alleles in heterozygous individuals (Fig. 1a) [1]. AE analysis can capture both common cis-regulatory variation, for example, expression quantitative trait loci (eQTLs), and rare regulatory variation [2]. It can also be used to measure allele-specific epigenetic effects such as parent-of-origin imprinting [3]. In practice, AE analysis uses RNA-seq reads that overlap heterozygous single nucleotide polymorphisms (SNPs), where the SNP can be used to assign the read to an allele. These heterozygous SNPs capture the cumulative effects of cis-regulatory variation acting on each allele. Allelic imbalance occurs when the two alleles of a gene are expressed at different levels. The magnitude of the imbalance can be quantified by allelic fold change (aFC) [1], and the statistical significance of the imbalance can be evaluated using binomial-based statistics to account for the count-based nature of the data [4]. In some cases, these effects can be caused by the SNPs being used to measure AE themselves, for example,

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Common