Alignment of RNA molecules: Binding energy and statistical properties of random sequences
TICAL, NONLINEAR, AND SOFT MATTER PHYSICS
O. V. Valba (a, *), S. K. Nechaev (b, c, **), and M. V. Tamm (d, ***)
(a) Moscow Institute of Physics and Technology (State University), Dolgoprudny, Moscow oblast, 141700 Russia
(b) LPTMS, Université Paris Sud, 91405, Orsay Cedex, France
(c) Lebedev Physical Institute, Russian Academy of Sciences, Moscow, 119991 Russia
(d) Moscow State University, Moscow, 119992 Russia
*email: [email protected]
**email: [email protected]
***email: [email protected]
Received April 29, 2011
Abstract: A new statistical approach to the problem of pairwise alignment of RNA sequences is proposed. The problem is analyzed for a pair of interacting polymers forming RNA-like hierarchical cloverleaf structures. An alignment is characterized by the numbers of matches, mismatches, and gaps. A weight function is assigned to each alignment; this function is interpreted as a free energy taking into account both direct monomer–monomer interactions and a combinatorial contribution due to the formation of various cloverleaf secondary structures. The binding free energy is determined for a pair of RNA molecules. Statistical properties are discussed, including fluctuations of the binding energy between a pair of RNA molecules and the loop length distribution in a complex. Based on an analysis of the free energy per nucleotide pair in complexes of random RNAs as a function of the number of nucleotide types c, a hypothesis is put forward about the exclusivity of the alphabet c = 4 used by nature.
DOI: 10.1134/S1063776112020355
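As an illustration of the terms used in the abstract, the following minimal Python sketch counts matches, mismatches, and gaps in a given gapped alignment and combines them into a linear weight. It is the classical linear scoring scheme, not the free energy proposed in the paper: the cloverleaf combinatorial contribution is omitted. The names `count_alignment` and `linear_weight` and the numerical weights are hypothetical, and a "match" is taken here to mean identical letters; for binding, the equality test would be replaced by a complementarity test.

```python
from typing import NamedTuple


class AlignmentCounts(NamedTuple):
    matches: int
    mismatches: int
    gaps: int


def count_alignment(row_a: str, row_b: str, gap: str = "-") -> AlignmentCounts:
    """Count matches, mismatches, and gaps column by column."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = mismatches = gaps = 0
    for x, y in zip(row_a, row_b):
        if x == gap and y == gap:
            raise ValueError("a column cannot contain two gaps")
        if x == gap or y == gap:
            gaps += 1          # one letter is left unpaired
        elif x == y:
            matches += 1       # identical letters (assumed meaning of "match")
        else:
            mismatches += 1
    return AlignmentCounts(matches, mismatches, gaps)


def linear_weight(counts: AlignmentCounts,
                  match_w: float = 1.0,
                  mismatch_w: float = -1.0,
                  gap_w: float = -2.0) -> float:
    """Linear weight of an alignment; the default weights are illustrative only."""
    return (counts.matches * match_w
            + counts.mismatches * mismatch_w
            + counts.gaps * gap_w)


counts = count_alignment("GAUC-GA", "GA-CUGU")
print(counts, linear_weight(counts))
```

For the two rows above, the columns give 4 matches, 1 mismatch, and 2 gaps, so the illustrative weight is 4 - 1 - 4 = -1.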
1. INTRODUCTION

The role played by RNA and DNA in cellular regulation mechanisms is well known. These biopolymers are responsible for the storage and transmission of genetic information. Besides translation and transcription (which involves DNA–RNA complexes), an extremely important role is played by RNA–RNA pairing interactions. These interactions are key to regulation of gene expression [1–3]. Schematically, the formation of an RNA–RNA complex occurs by complementary base pairing of an RNA with a messenger RNA (mRNA) or its segment, which precludes translation from the mRNA [3]. The RNA molecules participating in processes of this type are called noncoding RNAs (ncRNAs) because they are not themselves translated into proteins [2] and are therefore left out of the transcription process.

In view of the important role of RNA–RNA interactions in biological processes, an efficient algorithm is required for theoretically calculating the RNA–RNA binding energy given the primary sequences, as well as for predicting ncRNA secondary structure (i.e., thermodynamically optimal intrachain bonding architecture). It is shown below that this problem is closely related to that of pairwise alignment of arbitrary DNA-like sequences. One important distinction of RNA alignment from the analogous problem for DNA is the existence of nontrivial secondary structure of RNA molecules. RNA molecules belong to the cl