A multi-omics supervised autoencoder for pan-cancer clinical outcome endpoints prediction

  • PDF / 1,296,971 Bytes
  • 9 Pages / 595.276 x 790.866 pts Page_size
  • 4 Downloads / 128 Views

DOWNLOAD

REPORT


RESEARCH

Open Access

A multi-omics supervised autoencoder for pan-cancer clinical outcome endpoints prediction Kaiwen Tan, Weixian Huang, Jinlong Hu and Shoubin Dong* From 5th China Health Information Processing Conference Guangzhou, China. 22-24 November 2019

Abstract Background: With the rapid development of sequencing technologies, collecting diverse types of cancer omics data become more cost-effective. Many computational methods attempted to represent and fuse multiple omics into a comprehensive view of cancer. However, different types of omics are related and heterogeneous. Most of the existing methods do not consider the difference between omics, so the biological knowledge of individual omics may not be fully excavated. And for a given task (e.g. predicting overall survival), these methods prefer to use sample similarity or domain knowledge to learn a more reasonable representation of omics, but it’s not enough. Methods: For the purpose of learning more useful representation for individual omics and fusing them to improve the prediction ability, we proposed an autoencoder-based method named MOSAE (Multi-omics Supervised Autoencoder). In our method, a specific autoencoder were designed for each omics according to their size of dimension to generate omics-specific representations. Then, a supervised autoencoder was constructed based on specific autoencoder by using labels to enforce each specific autoencoder to learn both omics-specific and taskspecific representations. Finally, representations of different omics that generate from supervised autoencoders were fused in a traditional but powerful way, and the fused representation was used for subsequent predictive tasks. Results: We applied our method over TCGA Pan-Cancer dataset to predict four different clinical outcome endpoints (OS, PFI, DFI, and DSS). Compared with traditional and state-of-the-art methods, MOSAE achieved better predictive performance. We also tested the effects of each improvement, which all have a positive effect on predictive performance. Conclusions: Predicting clinical outcome endpoints are very important for precision medicine and personalized medicine. And multi-omics fusion is an effective way to solve this problem. MOSAE is a powerful multi-omics fusion method, which can generate both omics-specific and task-specific representation for given endpoint predictive tasks and improve the predictive performance. Keywords: Multic-omics, Autoencoder, Fusion, Representation, Pan-Cancer, Endpoints

* Correspondence: [email protected] Communication & Computer Network Lab of Guangdong, School of Computer Science & Engineering, South China University of Technology, Wushan Road, Guangzhou 381, China © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence,