Deep learning based DNA:RNA triplex forming potential prediction


Open Access

SOFTWARE

Yu Zhang1, Yahui Long2 and Chee Keong Kwoh1*

*Correspondence: [email protected]
1 School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
Full list of author information is available at the end of the article

Abstract

Background: Long non-coding RNAs (lncRNAs) can exert their functions by forming triplexes with DNA. Current methods for predicting triplex formation rely mainly on mathematical statistics derived from base-pairing rules. However, these methods have two main limitations: (1) they identify a large number of triplex-forming lncRNAs, yet the small number of experimentally verified triplex-forming lncRNAs suggests that not all of them form triplexes in practice, and (2) their predictions consider only the theoretical relationship and lack features derived from experimentally verified data.

Results: In this work, we develop an integrated program named TriplexFPP (Triplex Forming Potential Prediction), which is the first machine learning model for DNA:RNA triplex prediction. TriplexFPP predicts the most likely triplex-forming lncRNAs and DNA sites based on experimentally verified data, with high-level features learned by convolutional neural networks. In fivefold cross-validation, the average areas under the ROC curve and the precision-recall curve (PRC) for the redundancy-removed triplex-forming lncRNA dataset (threshold 0.8) are 0.9649 and 0.9996, and the corresponding values for triplex DNA site prediction are 0.8705 and 0.9671, respectively. We also briefly summarize the cis and trans targeting of triplex-forming lncRNAs.

Conclusions: TriplexFPP is able to predict the most likely triplex-forming lncRNAs among all lncRNAs with computationally defined triplex-forming capacity, as well as the potential of a DNA site to form a triplex. It may provide insights into the exploration of lncRNA functions.

Keywords: Long noncoding RNAs, DNA:RNA triplex, Deep learning
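The evaluation protocol named in the abstract (fivefold cross-validation, reporting the average area under the ROC curve and under the precision-recall curve) can be illustrated with a minimal sketch. This is not the TriplexFPP implementation: the convolutional network is replaced here by a toy nearest-centroid scorer on a single numeric feature, and all function names (`auroc`, `average_precision`, `stratified_folds`, `cross_validate`) are hypothetical, chosen only for illustration.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Tied scores keep their input order, so ties can affect the value.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validate(features, labels, k=5):
    """Return (mean AUROC, mean AUPRC) over k folds for a toy centroid scorer."""
    aurocs, auprcs = [], []
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # Placeholder "model": class centroids estimated on the training folds.
        mu_pos = mean(features[i] for i in train_idx if labels[i] == 1)
        mu_neg = mean(features[i] for i in train_idx if labels[i] == 0)
        # Higher score means closer to the positive centroid.
        scores = [abs(features[i] - mu_neg) - abs(features[i] - mu_pos)
                  for i in test_idx]
        y = [labels[i] for i in test_idx]
        aurocs.append(auroc(y, scores))
        auprcs.append(average_precision(y, scores))
    return mean(aurocs), mean(auprcs)


if __name__ == "__main__":
    rng = random.Random(1)
    labels = [1] * 50 + [0] * 50
    features = [rng.gauss(1.0 if y else -1.0, 1.0) for y in labels]
    mean_auroc, mean_auprc = cross_validate(features, labels)
    print(f"mean AUROC={mean_auroc:.4f}, mean AUPRC={mean_auprc:.4f}")
```

The two metrics are averaged per fold rather than pooled across folds, which matches the abstract's wording of "average values"; whether the authors pooled or averaged internally is not stated in this excerpt.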

Background

Advances in sequencing technologies have enabled the discovery of a vast number of long non-coding RNAs (lncRNAs). lncRNAs can serve as signals, decoys, guides, and scaffolds to carry out functions such as modulating chromatin states and regulating gene expression. They act through interactions with DNA, proteins, and other RNAs, for example by coordinating regulatory proteins, localizing to target loci, and shaping three-dimensional (3D) nuclear organization [1–3].

© The Author(s) 2020. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in