Self-evoluting framework of deep convolutional neural network for multilocus protein subcellular localization

ORIGINAL ARTICLE

Hanhan Cong 1,2 & Hong Liu 1,2 & Yuehui Chen 3,4 & Yi Cao 3,4

Received: 2 October 2019 / Accepted: 14 October 2020

© International Federation for Medical and Biological Engineering 2020

Abstract

In this paper, a deep convolutional neural network (DCNN) is applied to multilocus protein subcellular localization because it is well suited to multi-class classification. This application raises two main problems. First, appropriate features that capture the correlation between multiple sites are hard to find. Second, the classifier structure is difficult to determine because it is strongly affected by the distribution of the data being classified. To solve these problems, a self-evoluting framework using DCNNs for multilocus protein subcellular localization is proposed. It has three characteristics that previous algorithms lack. First, it combines the ant colony algorithm with the DCNN to form a self-evoluting algorithm for multilocus protein subcellular localization. Second, it randomly groups subcellular sites using a limited random k-labelsets multi-label classification method, which solves the complex problem with a divide-and-conquer approach and yields a flexible expansion model. Third, it selects the feature extraction method at random during localization, which avoids the defects of any individual feature extraction method. The algorithm is tested on the human database, where its overall correct rate is 67.17%, higher than that of the stacked autoencoder (SAE), support vector machine (SVM), random forest (RF) classifier, or a single deep convolutional neural network.

Keywords Multilocus protein subcellular localization · Deep convolutional neural network · Ant colony algorithm · Random k-labelsets

* Hong Liu [email protected]

Hanhan Cong [email protected]

Yuehui Chen [email protected]

Yi Cao [email protected]

1 School of Information Science and Engineering, Shandong Normal University, No. 88, Wenhua East Road, Jinan City, China

2 Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Shandong Normal University, Jinan, China

3 School of Information Science and Engineering, University of Jinan, Jinan, China

4 Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan, China

1 Introduction

Proteins are essential to the operation of highly ordered cell systems. A protein can work properly only in specific structures within the cell, called subcellular locations, which provide the required chemical environment and components [1, 2]. Protein subcellular localization is therefore important in life science and bioinformatics [3]. Accurate prediction of protein subcellular localization is of great significance for pathogenesis analysis, drug design, and disease discovery [4, 5]. The traditional study of protein subcellular localization relies on experimental