Recent advances on the machine learning methods in predicting ncRNA-protein interactions
REVIEW
Lin Zhong1 · Meiqin Zhen2 · Jianqiang Sun3 · Qi Zhao4
Received: 1 June 2020 / Accepted: 17 September 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
Recent transcriptomics and bioinformatics studies have shown that ncRNAs can affect chromosome structure and gene transcription, participate in epigenetic regulation, and contribute to disease processes such as tumorigenesis. Biologists have found that most ncRNAs work by interacting with their corresponding RNA-binding proteins, so ncRNA-protein interaction is an active research topic in both biology and medicine. Because manual laboratory experiments have practical limitations, researchers increasingly favor machine learning methods for predicting ncRNA-protein interactions. In this review, we summarize several machine learning models for predicting ncRNA-protein interactions published over the past few years and briefly describe their characteristics. We close with some promising computational directions for improving the performance of these models.

Keywords ncRNA · Protein · ncRNA-protein interaction · Machine learning methods · Predictive models

Abbreviations
ncRNA: Non-coding RNA
rRNA: Ribosomal RNA
tRNA: Transfer RNA
miRNA: MicroRNA
snRNA: Small nuclear RNA
lncRNAs: Long non-coding RNAs
SVM: Support vector machine
RF: Random forest
LOOCV: Leave-one-out cross validation
K-CV: K-fold cross validation
ROC: Receiver operating characteristic
AUC: Area under the ROC curve
PPSNs: Protein–protein similarity networks
SNF: Similarity network fusion
XGB: Extreme gradient boosting
SAN: Stacked autoencoder networks
PSSM: Position-specific scoring matrix
SVD: Singular value decomposition
PZM: Pseudo-Zernike moment
PCC: Pearson correlation coefficient
GBDT: Gradient boosting decision tree
Extra tree: Extremely randomized trees
LMs: Legendre moments
PWM: Position weight matrix

Lin Zhong and Meiqin Zhen contributed equally to this work.
* Qi Zhao [email protected]
1 School of Mathematics, Liaoning University, Shenyang 110036, China
2 Beijing Chest Hospital, Capital Medical University/Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing 101149, China
3 School of Automation and Electrical Engineering, Linyi University, Linyi 276000, China
4 School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China
Introduction
Since Andrew Fire and Craig Mello won the Nobel Prize in 2006 for “RNA Interference: Gene Silencing by Double-Stranded RNA”, RNA has attracted growing attention from researchers, and it has been one of the most active areas of life science research for nearly a decade. RNA is generally divided into coding RNA and non-coding RNA (ncRNA) according to whether it codes for protein. NcRNA is an RNA th