QGRS-Conserve: a computational method for discovering evolutionarily conserved G-quadruplex motifs


PRIMARY RESEARCH

Open Access

Scott Frees1*, Camille Menendez2, Matt Crum2 and Paramjeet S Bagga2

Abstract

Background: Nucleic acids containing guanine tracts can form quadruplex structures via non-Watson-Crick base pairing. Formation of G-quadruplexes is associated with the regulation of important biological functions such as transcription, genetic instability, DNA repair, DNA replication, epigenetic mechanisms, regulation of translation, and alternative splicing. G-quadruplexes play important roles in human diseases and are being considered as targets for a variety of therapies. Identifying functional G-quadruplexes and studying their overall distribution in genomes and transcriptomes is therefore an important pursuit. Traditional computational methods map sequence motifs capable of forming G-quadruplexes but have difficulty distinguishing motifs that occur by chance from those that actually fold into G-quadruplexes.

Results: We present Quadruplex forming ‘G’-rich sequences (QGRS)-Conserve, a computational method that calculates motif conservation across exomes and supports filtering, giving researchers more precise ways of studying G-quadruplex distribution patterns. Our method quantitatively evaluates conservation between quadruplexes found in homologous nucleotide sequences based on several structural characteristics of the motifs. QGRS-Conserve also efficiently manages overlapping G-quadruplex sequences so that the resulting datasets can be analyzed effectively.

Conclusions: We have applied QGRS-Conserve to identify a large number of G-quadruplex motifs in the human exome that are conserved across several mammalian and non-mammalian species. We have successfully identified multiple homologs of many previously published G-quadruplexes that play post-transcriptional regulatory roles in human genes. Preliminary large-scale analysis identified many homologous G-quadruplexes in the 5′- and 3′-untranslated regions of mammalian species. As expected, a smaller set of G-quadruplex motifs was found to be conserved across larger phylogenetic distances. QGRS-Conserve provides a means to build datasets that can be filtered and categorized along a variety of biological dimensions for more targeted studies, in order to better understand the roles that G-quadruplexes play.

Keywords: G-quadruplex, Computational method, Cis-regulatory motifs, Evolutionary conservation
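To make concrete what the abstract means by mapping sequence motifs capable of forming G-quadruplexes, the following is a minimal sketch of a regular-expression scan. It assumes a widely used folding pattern of four guanine tracts of at least three Gs each, separated by loops of 1 to 7 nucleotides. That pattern, the name `G4_PATTERN`, and the function `find_g4_motifs` are illustrative assumptions and not the motif definition or scoring used by QGRS-Conserve. The lookahead is there so that motifs starting at different positions are all reported even when they overlap.

```python
import re

# Assumed pattern (not the QGRS-Conserve definition): four G-tracts of
# >= 3 guanines, separated by loops of 1-7 nucleotides. Wrapping the
# pattern in a lookahead lets finditer report overlapping candidates.
G4_PATTERN = re.compile(
    r"(?=(G{3,}[ACGTU]{1,7}?G{3,}[ACGTU]{1,7}?G{3,}[ACGTU]{1,7}?G{3,}))"
)


def find_g4_motifs(sequence):
    """Return (start, motif) pairs for every putative G-quadruplex motif.

    `start` is the 0-based position of the first guanine of the motif.
    At most one candidate is reported per start position.
    """
    seq = sequence.upper()  # accept lower-case input
    return [(m.start(), m.group(1)) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    for start, motif in find_g4_motifs("AAGGGTGGGTGGGTGGGAA"):
        print(start, motif)
```

A scan like this reports every sequence that fits the pattern, whether or not it folds in vivo, which is the limitation of pattern-only mapping that the abstract describes.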

Background

Nucleic acids containing guanine tracts can form quadruplex structures via non-Watson-Crick base pairing [1,2]. These structures, known as G-quadruplexes, are composed of stacked G-tetrads - square co-planar arrays of four guanine bases each (Figure 1). Each tetrad in the G-quadruplex is stabilized by cyclic Hoogsteen hydrogen bonding between the four guanines [3-6]. G-quadruplexes can be formed by unimolecular interactions via repeated

* Correspondence: [email protected]
1 Department of Computer Science, Ramapo College of New Jersey, 505 Ramapo Valley Road, Mahwah, NJ