Prediction of Signal Peptides Using Bio-Basis Function Neural Networks and Decision Trees


BIOMEDICAL GENOMICS AND PROTEOMICS

© 2006 Adis Data Information BV. All rights reserved.

Ateesh Sidhu1 and Zheng Rong Yang2

1 Biological Science, University of Warwick, Coventry, UK

2 Department of Computer Science, University of Exeter, Exeter, UK

Abstract

Signal peptide identification is of great importance in drug design. Accurate identification of signal peptides is the first critical step towards redirecting targeted proteins, so that a designed drug can be delivered to a specific organelle to correct a defect. Experimental identification is the most accurate method, but it is expensive and time-consuming, so an efficient and affordable automated system is of great interest. In this article, we propose using an adapted neural network, called a bio-basis function neural network, and decision trees for predicting signal peptides. The bio-basis function neural network model and decision trees achieved 97.16% and 97.63% accuracy, respectively, demonstrating that both methods work well for the prediction of signal peptides. Moreover, the decision trees revealed that position P1′, which is important in forming signal peptides, most commonly comprises either leucine or alanine. This concurs with the (P3-P1-P1′) coupling model.

Protein signals have become a crucial tool for scientists in drug design. New drugs can be constructed using protein signals to target a certain organelle and correct a specific defect.[1] Identifying signal peptides experimentally is expensive and time-consuming. With a large amount of unprocessed sequence data available, there has been growing interest in automated methods for the prediction of signal peptides in protein sequences.[2] An efficient and affordable automated system will be of great benefit in assisting and speeding up experimental identification.

Signal peptides direct proteins to their respective cellular and extracellular locations. One example is the translocation of proteins across the cytoplasmic membrane via the secretory pathway found in both eukaryotes and prokaryotes.[3] The protein to be translocated is labelled with a signal peptide at its N-terminus. The signal peptide is usually cleaved by an extracellular peptidase after translocation. Signal peptides have a positively charged n-region, followed by a hydrophobic h-region and a neutral but polar c-region.[4] The cleavage site of the signal peptide is located in the c-region; however, the degree of conservation of the signal sequence and its cleavage site can vary between different proteins. Some eukaryotic secretory preproteins have a basic residue near the N-terminus followed by a hydrophobic core of 7–13 residues.[5] The (P3, P1) rule states that the amino acids upstream from the cleavage site at positions P3 and P1 must have short side chains and neutral charge to allow correct cleavage.[6,7] Position P1 is highly influential in the cleavage of signal peptides.[8] A number of c