Sparse RNA folding revisited: space-efficient minimum free energy structure prediction

Algorithms for Molecular Biology Open Access

REVIEW ARTICLE

Sebastian Will1* and Hosna Jabbari2,3,4

Abstract

Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction has been solved only for simple base-pair-based pseudo-energy models.

Results: Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible as a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold, which guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by n^2, but are typically much smaller. The time complexity of RNA folding is reduced from O(n^3) to O(n^2 + nZ); the space complexity, from O(n^2) to O(n + T + Z). Our empirical results show more than 80% space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases).

Conclusions: The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA-RNA interaction prediction are expected to profit even more than "standard" MFE folding.
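To make the role of the candidate count Z more concrete, the following is a minimal sketch of candidate-based sparsification for the simple base-pair maximization model, not for the free energy model and not the SparseMFEFold implementation. The function name `sparse_max_pairs`, the `min_loop` parameter, and the pairing rule set are assumptions introduced for illustration only. The sketch keeps two rows of the score matrix plus the candidate lists, so its space is O(n + Z) and its time is O(n^2 + nZ); it computes only the optimal score and omits fold reconstruction, which is where trace arrows would come in.

```python
# Illustrative sketch only: base-pair maximization with candidate lists.
# All names here are hypothetical and not taken from SparseMFEFold.

CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def sparse_max_pairs(seq, min_loop=3):
    """Return (max number of base pairs, number of candidates Z).

    L(i, j) is the best score for seq[i..j]; a row stores L(i, j) at index
    j + 1, so index i holds the empty interval L(i, i - 1) = 0.
    """
    n = len(seq)
    # candidates[j] holds (k, closed_score) for every candidate [k, j]:
    # intervals where pairing k with j beats every other decomposition.
    candidates = [[] for _ in range(n)]
    prev = [0] * (n + 1)  # row i + 1 of the score matrix
    for i in range(n - 1, -1, -1):
        cur = [0] * (n + 1)  # row i of the score matrix
        for j in range(i, n):
            # Case 1: position j stays unpaired, score L(i, j - 1).
            best = cur[j]
            # Case 2: split into L(i, k - 1) plus a candidate closed at (k, j).
            for k, closed in candidates[j]:
                best = max(best, cur[k] + closed)
            # Case 3: i pairs with j, enclosing at least min_loop bases.
            if j - i > min_loop and (seq[i], seq[j]) in CANONICAL:
                closed_ij = 1 + prev[j]  # 1 + L(i + 1, j - 1)
                if closed_ij > best:
                    # Strictly better than all other cases: a new candidate.
                    candidates[j].append((i, closed_ij))
                    best = closed_ij
            cur[j + 1] = best
        prev = cur
    z = sum(len(c) for c in candidates)
    return prev[n], z
```

For example, `sparse_max_pairs("GGGAAAUCCC")` yields a score of 3 as its first component. Only intervals whose closing pair is strictly better than every other decomposition enter the candidate lists, which is why Z tends to stay well below its n^2 bound.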
SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold.

Keywords: Space efficient sparsification, Pseudoknot-free RNA folding, RNA secondary structure prediction

*Correspondence: [email protected]
1 Bioinformatics/IZBI, University Leipzig, Härtelstrasse 16–18, Leipzig, Germany
Full list of author information is available at the end of the article

Background

The manifold catalytic and regulatory functions of noncoding RNAs are mediated by the formation of intermolecular structures with other RNAs or proteins, as well as their intramolecular structures [3, 5, 9]. Currently, computational RNA structure prediction methods mainly focus on predicting RNA secondary structure, the set of base pairs that form when RNA molecules fold. There is evidence that RNA molecules in their natural environments tend to fold into their minimum free energy (MFE) secondary structure