Protein structure prediction problem: formalization using quaternions


L. F. Hulianytskyi and V. O. Rudyk

UDC 519.8

Abstract. The authors discuss the formalization of the protein tertiary structure prediction problem based on Dill’s HP-model. Three-dimensional discrete lattices and different approaches to representing paths on them are investigated. Two ways of path encoding are proposed and formalized, one of which is based on quaternions.

Keywords: protein folding, tertiary structure of a protein molecule, discrete lattices, quaternions.

INTRODUCTION

The problem of predicting the tertiary structure of a protein molecule, also called the protein folding problem, has recently become one of the most important and most intensively studied problems in computational biology. This structure plays a key role in determining the functional properties of proteins and is a source of information needed to obtain fundamental and applied results in various fields of science and technology, such as bioinformatics, medicine, pharmaceutics, computational geometry, and nanotechnology.

The essence of the protein tertiary structure prediction problem is as follows: given the linear sequence of elements that constitute a molecule, determine its three-dimensional configuration. Well-known experimental approaches (such as X-ray crystallography and nuclear magnetic resonance spectroscopy) are used to solve this problem; however, they are expensive and time-consuming, and they do not always give satisfactory results in practice. Therefore, mathematical methods have been widely applied in recent years to the analysis of the structure of molecules.

The biophysical HP-model of protein folding, proposed by K. Dill in 1985, is the most extensively studied: it seeks the molecular structure that minimizes the energy potential [1–3]. Mathematically, a solution of the problem is a continuous two-dimensional or three-dimensional curve without self-intersections. The overwhelming majority of well-known protein structure models are discrete: the shape of the molecule is represented by a path in a discrete lattice.
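To make the lattice formulation concrete, the following minimal sketch evaluates one conformation in an HP-type model on the three-dimensional cubic lattice. It is an illustration under stated assumptions, not the encoding proposed by the authors: the one-letter move alphabet and the names `MOVES`, `decode_path`, `is_self_avoiding`, and `hp_energy` are hypothetical, and the energy uses the customary HP convention of -1 for every pair of hydrophobic (H) residues that occupy adjacent lattice sites without being consecutive in the chain, which this excerpt does not spell out.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, int, int]

# Unit steps of the cubic lattice, keyed by a hypothetical one-letter code.
MOVES = {
    "R": (1, 0, 0), "L": (-1, 0, 0),
    "U": (0, 1, 0), "D": (0, -1, 0),
    "F": (0, 0, 1), "B": (0, 0, -1),
}


def decode_path(moves: str) -> List[Point]:
    """Turn a string of moves into lattice coordinates, starting at the origin."""
    path: List[Point] = [(0, 0, 0)]
    for m in moves:
        dx, dy, dz = MOVES[m]
        x, y, z = path[-1]
        path.append((x + dx, y + dy, z + dz))
    return path


def is_self_avoiding(path: List[Point]) -> bool:
    """A valid conformation never visits the same lattice site twice."""
    return len(set(path)) == len(path)


def hp_energy(sequence: str, moves: str) -> Optional[int]:
    """Return minus the number of H-H contacts, or None for an invalid path.

    Assumed convention: a contact is a pair of H residues on adjacent
    lattice sites that are not neighbours along the chain.
    """
    path = decode_path(moves)
    if len(path) != len(sequence) or not is_self_avoiding(path):
        return None
    index_at = {p: i for i, p in enumerate(path)}
    contacts = 0
    for i, (x, y, z) in enumerate(path):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES.values():
            j = index_at.get((x + dx, y + dy, z + dz))
            # j > i + 1 counts each pair once and skips chain neighbours.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return -contacts
```

For example, the four-residue chain `"HPPH"` folded with the moves `"RUL"` brings its two H residues onto adjacent sites, so `hp_energy("HPPH", "RUL")` evaluates to -1, while the straight path `"RRR"` has no contacts and a path that revisits a site is rejected with `None`. An optimization algorithm would search over move strings for the one with the lowest energy.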
It is noteworthy that even when the chemical and biological properties of the protein are significantly simplified, the mathematical modeling involves NP-hard optimization problems [4]. To simplify the model, two-dimensional or three-dimensional cubic lattices are considered most often [5–11], although they have shortcomings in the context of this problem. Passing to more complex lattices requires encoding a path as a mathematical object that reflects its characteristics as well as possible; the search by various optimization algorithms is then carried out among these objects. An important condition for the successful application of methods for modeling the spatial structure of proteins is the choice of adequate mathematical tools for the formal description of the problem. In the present paper we briefly outline the principles that underlie Dill’s model and present the properties of lattices as mathematical objects, in particular, the invariance with respe