
RESEARCH

Open Access

Comprehensive genomic analysis reveals dynamic evolution of endogenous retroviruses that code for retroviral-like protein domains

Mahoko Takahashi Ueda1,2,3, Kirill Kryukov1,4, Satomi Mitsuhashi5,6, Hiroaki Mitsuhashi2,7, Tadashi Imanishi1,8 and So Nakagawa1,2,8*

Abstract

Background: Endogenous retroviruses (ERVs) are remnants of ancient retroviral infections of mammalian germline cells. A large proportion of ERVs lose their open reading frames (ORFs), while others retain them and become exapted by the host species. However, it remains unclear what proportion of ERVs possess ORFs (ERV-ORFs), become transcribed, and serve as candidates for co-opted genes.

Results: We investigated characteristics of 176,401 ERV-ORFs containing retroviral-like protein domains (gag, pro, pol, and env) in 19 mammalian genomes. The fractions of ERVs possessing ORFs were small overall (~0.15%), although they varied with domain type and species. The observed divergence of ERV-ORFs from their consensus sequences showed bimodal distributions, suggesting that a large proportion of ERV-ORFs inserted themselves into mammalian genomes either recently or anciently. In contrast, very few ERVs lacking ORFs were found to exhibit similar divergence patterns. To identify candidates for ERV-derived genes, we estimated the ratio of non-synonymous to synonymous substitution rates (dN/dS) for ERV-ORFs in human and non-human mammalian pairs, and found that approximately 42% of the ERV-ORFs showed dN/dS < 1. Further, using functional genomics data including transcriptome sequencing, we determined that approximately 9.7% of these selected ERV-ORFs exhibited transcriptional potential.

Conclusions: These results suggest that purifying selection operates on a certain portion of ERV-ORFs, some of which may correspond to uncharacterized functional genes hidden within mammalian genomes. Together, our analyses suggest that more ERV-ORFs may have been co-opted in a host-species-specific manner than is currently known, and that these are likely to have contributed to mammalian evolution and diversification.

Keywords: Endogenous retrovirus, Retroviral-like protein domain, Open reading frame, Evolution, Divergence pattern, Co-option, de novo gene
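The dN/dS < 1 criterion used in the Results can be illustrated with a minimal sketch. It shows only how the ratio is conventionally read (below 1 is taken as consistent with purifying selection, above 1 with positive selection) and how a fraction such as the reported "approximately 42%" would be tallied. It is not the authors' estimation pipeline: the function names and the (dN, dS) values below are hypothetical, and estimating dN and dS from aligned sequences is a separate step that is not shown.

```python
from typing import Iterable, Optional, Tuple


def omega(dn: float, ds: float) -> Optional[float]:
    """Return the ratio dN/dS, or None when dS is not positive (ratio undefined)."""
    if ds <= 0:
        return None
    return dn / ds


def classify(dn: float, ds: float) -> str:
    """Label a pairwise comparison by the conventional reading of dN/dS."""
    w = omega(dn, ds)
    if w is None:
        return "undefined"
    if w < 1:
        return "purifying"   # fewer non-synonymous changes than expected
    if w > 1:
        return "positive"    # excess of non-synonymous changes
    return "neutral"


def fraction_purifying(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of comparisons with a defined ratio that show dN/dS < 1."""
    labels = [classify(dn, ds) for dn, ds in pairs]
    defined = [label for label in labels if label != "undefined"]
    if not defined:
        return 0.0
    return sum(label == "purifying" for label in defined) / len(defined)


# Hypothetical (dN, dS) estimates for four ortholog pairs; not data from the paper.
example_pairs = [(0.02, 0.10), (0.15, 0.12), (0.05, 0.05), (0.01, 0.0)]

if __name__ == "__main__":
    for dn, ds in example_pairs:
        print(dn, ds, classify(dn, ds))
    print(fraction_purifying(example_pairs))
```

A point estimate below 1 is only suggestive on its own; in practice a statistical test against neutrality would normally accompany a threshold like this.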

* Correspondence: [email protected]
1 Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa 259-1193, Japan
2 Micro/Nano Technology Center, Tokai University, Hiratsuka, Kanagawa 259-1292, Japan
Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicate