Dark matter RNA illuminates the puzzle of genome-wide association studies


MINIREVIEW

Open Access

Georges St. Laurent1, Yuri Vyatkin1,2 and Philipp Kapranov1*

Abstract

In the past decade, numerous studies have made connections between sequence variants in human genomes and predisposition to complex diseases. However, most of these variants lie outside of the charted regions of the human genome whose function we understand; that is, the sequences that encode proteins. Consequently, the general concept of a mechanism that translates these variants into predisposition to diseases has been lacking, potentially calling into question the validity of these studies. Here we make a connection between the growing class of apparently functional RNAs that do not encode proteins and whose function we do not yet understand (the so-called ‘dark matter’ RNAs) and the disease-associated variants. We review advances made in a different genomic mapping effort (unbiased profiling of all RNA transcribed from the human genome) and provide arguments that the disease-associated variants exert their effects via perturbation of regulatory properties of non-coding RNAs existing in mammalian cells.

Keywords: Genome-wide association study, Non-coding RNA, vlincRNA, Intronic RNA, lncRNA, RNA scaffold, LincRNA, Long Non-coding RNA, Long intergenic non-coding RNA, Very long intergenic non-coding RNA

Introduction

Connecting variations in DNA sequence with a biological or medical phenotype has long served to map functional elements of a genome. The recent genomics revolution has facilitated the identification of such variants on a massive scale, ushering in the era of genome-wide association studies (GWAS). Since the first pioneering report in 2005 [1], hundreds of such analyses have identified thousands of changes in DNA sequence (primarily single nucleotide polymorphisms (SNPs)) associated with a large number of complex diseases (cancers, heart disease, brain disorders, obesity, and many others) [2,3]. However, most of these variants have accumulated in unannotated, non-coding regions of the genome, whose functions continue to pose an enigma (Figure 1). Therefore, much of the wealth of GWAS information remains unrealized, with the mechanisms of action of the underlying genomic regions unknown, despite their widespread associations with disease.

* Correspondence: [email protected]
1 St. Laurent Institute, 317 New Boston St, Suite 201, Woburn, MA 01801, USA
Full list of author information is available at the end of the article

Pervasive transcription: the answer to function behind the non-coding GWAS variants?

Only 2 to 3% of the human genome encodes proteins, the building blocks of life whose function we understand fairly well. The remaining 97 to 98% consists of non-coding sequences, which were long considered ‘junk DNA’ because they did not fit the protein-centric view that dominated biology for decades. The goal of connecting DNA sequence variants to this protein-coding sliver of the human genome has shaped their interpretat