DNA Secondary Structure Prediction by Fuzzy Logic Inference Systems



SOFTWARE–HARDWARE SYSTEMS

DNA SECONDARY STRUCTURE PREDICTION BY FUZZY LOGIC INFERENCE SYSTEMS

I. V. Sergienko (a, †) and O. O. Provotar (a, ‡)

UDC 681.3

Abstract. The construction of fuzzy logic inference systems to predict DNA secondary structure is discussed. An example of predicting the structure of the central residue of the MutS protein as the output of a fuzzy system with Mamdani's inference procedure is presented.

Keywords: fuzzy system, DNA secondary structure, fuzzy set.

INTRODUCTION

The spatial structure of a protein is known [1, 2] to be determined by its amino acid sequence. In turn, the spatial structure determines the functionality of the protein. Predicting protein structures at different hierarchical levels is a rather complex problem. Various methods and approaches are used to solve it, including experimental ones (based on the physics of chemical bond formation), machine learning (using databases of experimentally determined secondary structures as training samples), and probabilistic ones (based on Bayesian procedures and Markov chains).

In this paper, we propose a method to predict DNA secondary structure using fuzzy logic inference systems. The task is as follows: construct a fuzzy logic inference system that, given an arbitrary amino acid sequence, determines (in the form of a fuzzy set) the secondary structure of the central residue (amino acid) of the input sequence. To solve this problem, a fuzzy system must first be designed based on training samples.

CONSTRUCTION OF FUZZY RULES

One method of constructing a system of fuzzy rules from numerical data is presented in [3]. It is as follows. For simplicity, consider a rule base with two inputs and one output. This requires training data (samples) in the form of a set $(x_1(i), x_2(i); d(i))$, $i = 1, 2, \ldots, m$, where $x_1(i)$ and $x_2(i)$ are the inputs of the fuzzy control module and $d(i)$ is its output. The task is to construct fuzzy rules such that the fuzzy logic inference system built on them generates correct output data from the input data.
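The training-data format described above can be illustrated with a minimal Python sketch. The sample values, the names `Sample`, `mean_abs_error`, and `placeholder_system`, and the choice of mean absolute error as a measure of "correct output" are all assumptions made for illustration; none of them come from the paper or from [3].

```python
from typing import Callable, List, Tuple

# One training sample (x1(i), x2(i); d(i)): two inputs and the desired output.
Sample = Tuple[float, float, float]


def mean_abs_error(system: Callable[[float, float], float],
                   samples: List[Sample]) -> float:
    """Average |system(x1, x2) - d| over all training samples."""
    return sum(abs(system(x1, x2) - d) for x1, x2, d in samples) / len(samples)


# Hypothetical data, invented only to show the shape of the training set.
samples: List[Sample] = [(0.0, 1.0, 1.0), (1.0, 1.0, 2.0), (2.0, 0.5, 2.5)]


# Stand-in for a fuzzy inference system; a real one would apply the rule base.
def placeholder_system(x1: float, x2: float) -> float:
    return x1 + x2


print(mean_abs_error(placeholder_system, samples))
```

Any callable that maps two inputs to one output can be passed as `system`, so the same check could be applied to a rule base once it has been constructed.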
The solution algorithm for this problem reduces to the following sequence of steps.

Step 1. Dividing the input and output spaces into domains and determining the respective membership functions. Divide each input and output into $2N + 1$ intervals, where $N$ is a number selected individually for each input. Denote the separate domains (intervals) as follows: $M_N$ (left $N$), $\ldots$, $M_1$ (left 1), $S$ (middle), $D_1$ (right 1), $\ldots$, $D_N$ (right $N$). Each membership function has a certain form, for example, triangular (Fig. 1).
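Step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the peaks of the triangles are spaced evenly over the range, neighboring triangles overlap so that each foot lies at the adjacent peak, and the outermost domains peak at the range boundaries. The text fixes only the number of domains (2N + 1), their labels, and the triangular shape; the function names `triangular` and `partition` are hypothetical.

```python
from typing import Callable, Dict


def triangular(a: float, b: float, c: float) -> Callable[[float], float]:
    """Triangular membership function with feet at a and c and peak at b."""
    def mu(x: float) -> float:
        if x == b:
            return 1.0
        if x <= a or x >= c:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


def partition(lo: float, hi: float, n: int) -> Dict[str, Callable[[float], float]]:
    """Split [lo, hi] into 2n + 1 domains M_n..M_1, S, D_1..D_n (n >= 1).

    Assumption: evenly spaced peaks, each triangle reaching to its neighbors' peaks.
    """
    count = 2 * n + 1
    step = (hi - lo) / (count - 1)
    labels = ([f"M{k}" for k in range(n, 0, -1)] + ["S"]
              + [f"D{k}" for k in range(1, n + 1)])
    return {
        label: triangular(lo + (i - 1) * step, lo + i * step, lo + (i + 1) * step)
        for i, label in enumerate(labels)
    }


# Example with N = 2 on the hypothetical range [0, 4]: domains M2, M1, S, D1, D2.
domains = partition(0.0, 4.0, 2)
degrees = {label: mu(1.5) for label, mu in domains.items()}
print(degrees)
```

With this layout every point of the range belongs to at most two adjacent domains, which makes it easy to read off the domain with the highest membership degree for a given value.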

V. M. Glushkov Institute of Cybernetics, National Academy of Sciences of Ukraine, Kyiv, Ukraine, [email protected]; ‡[email protected]. Translated from Kibernetika i Sistemnyi Analiz, No. 1, January–February, 2014, pp. 125–130. Original article submitted October 17, 2013.



1060-0396/14/5001-0110 © 2014 Springer Science+Business Media New York

[Fig. 1. Domain labels M2, M1, S, D1, D2; axis labels X and Y.]

Step 2. Constructing f