A machine learning framework for accurately recognizing circular RNAs for clinical decision-supporting


RESEARCH

Open Access

Yidan Wang1†, Xuanping Zhang1†, Tao Wang2†, Jinchun Xing2, Zhun Wu2, Wei Li2* and Jiayin Wang1*

From 5th China Health Information Processing Conference Guangzhou, China. 22-24 November 2019

Abstract

Background: Circular RNAs (circRNAs) are RNA molecules that lack poly(A) tails and form a closed-loop structure. Recent studies have emphasized that some circRNAs have functions different from those of canonical transcripts and are associated with complex diseases. Several computational methods have been developed for detecting circRNAs from RNA-seq data. However, the existing methods favor high-sensitivity strategies, which introduce many false positives. Thus, clinical decision-support systems need a comprehensive filtering approach that accurately recognizes real circRNAs for decision models.

Methods: In this paper, we first review the detection strategies of the existing methods. Using the features available from RNA-seq data, we show that no single feature (data signal) selected by the existing strategies can accurately distinguish a circRNA. However, we find that some combinations of those features (data signals) can be used as signatures for recognizing circRNAs. To avoid the high computational complexity of the combinatorial optimization problem, we present CIRCPlus2, which adopts a machine learning framework to recognize real circRNAs according to multiple data signals captured from RNA-seq data. After comparing multiple machine learning frameworks, we chose a Gradient Boosting Decision Tree (GBDT) framework for CIRCPlus2.
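The idea that no single data signal separates real circRNAs while a combination of signals can is illustrated by the minimal sketch below. It is not the CIRCPlus2 implementation: the three signals are synthetic and hypothetical, the labeling rule (the first two signals sum to more than 1.0) is an assumption chosen only for the demonstration, and the booster is a simplified gradient boosting of depth-1 trees (stumps) with logistic loss, written in plain Python.

```python
import math
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) for the depth-1
    regression tree that best fits the residuals in the least-squares sense."""
    n, total = len(X), sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Maximizing this score is equivalent to minimizing squared error.
            score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or score > best[0]:
                best = (score, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    return best[1:]


def stump_value(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_boosted_stumps(X, y, n_rounds, learning_rate=1.0):
    """Gradient boosting with logistic loss; each round fits one stump to the
    negative gradient (label minus predicted probability)."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - 1.0 / (1.0 + math.exp(-s)) for yi, s in zip(y, scores)]
        feature, threshold, left, right = fit_stump(X, residuals)
        stump = (feature, threshold, learning_rate * left, learning_rate * right)
        stumps.append(stump)
        scores = [s + stump_value(stump, row) for s, row in zip(scores, X)]
    return stumps


def accuracy(stumps, X, y):
    """Fraction of rows whose summed stump score has the correct sign."""
    hits = 0
    for row, label in zip(X, y):
        raw = sum(stump_value(s, row) for s in stumps)
        hits += int((1 if raw > 0 else 0) == label)
    return hits / len(X)


def make_data(n, rng):
    """Synthetic candidates with three hypothetical signals in [0, 1].
    Assumed rule: a candidate is 'real' when the first two signals sum to
    more than 1.0; the third signal is pure noise."""
    X, y = [], []
    for _ in range(n):
        row = [rng.random(), rng.random(), rng.random()]
        X.append(row)
        y.append(1 if row[0] + row[1] > 1.0 else 0)
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(300, rng)
X_test, y_test = make_data(300, rng)

# One round yields a single threshold on a single signal.
single_model = fit_boosted_stumps(X_train, y_train, n_rounds=1)
# Many rounds combine thresholds across several signals.
boosted_model = fit_boosted_stumps(X_train, y_train, n_rounds=100)

single_acc = accuracy(single_model, X_test, y_test)
boosted_acc = accuracy(boosted_model, X_test, y_test)
print(f"single-signal threshold accuracy: {single_acc:.3f}")
print(f"boosted combination accuracy:     {boosted_acc:.3f}")
```

Under the assumed labeling rule, a threshold on one signal alone is limited by construction, whereas the boosted ensemble can approximate the joint boundary by adding up many thresholds on both informative signals. A production setting would more likely rely on an established GBDT library than on a hand-written booster like this one.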

* Correspondence: [email protected]; [email protected]
† Yidan Wang, Xuanping Zhang and Tao Wang contributed equally to this work.
2 The Key Laboratory of Urinary Tract Tumors and Calculi, Department of Urology Surgery, The First Affiliated Hospital, School of Medicine, Xiamen University, Xiamen, China
1 School of Computer Science and Technology, Shaanxi Engineering Research Center of Medical and Health Big Data, Xi’an Jiaotong University, Xi’an, China

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/license