annoFuse: an R Package to annotate, prioritize, and interactively explore putative oncogenic RNA fusions


SOFTWARE

Open Access

Krutika S. Gaonkar1,2,3†, Federico Marini4,5†, Komal S. Rathi1,2,3, Payal Jain1,3, Yuankun Zhu1,3, Nicholas A. Chimicles1,2, Miguel A. Brown1,3, Ammar S. Naqvi1,2,3, Bo Zhang1,3, Phillip B. Storm1,3, John M. Maris6, Pichai Raman1,2,3, Adam C. Resnick1,2,3, Konstantin Strauch4, Jaclyn N. Taroni7 and Jo Lynne Rokita1,2,3*

*Correspondence: [email protected]
† Krutika S. Gaonkar and Federico Marini have contributed equally
1 Center for Data-Driven Discovery in Biomedicine, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
Full list of author information is available at the end of the article

Abstract

Background: Gene fusion events are significant sources of somatic variation across adult and pediatric cancers and are some of the most clinically effective therapeutic targets, yet low consensus among RNA-Seq fusion prediction algorithms makes therapeutic prioritization difficult. In addition, events such as polymerase read-throughs, mis-mapping due to gene homology, and fusions occurring in healthy normal tissue require informed filtering, making it difficult for researchers and clinicians to rapidly discern gene fusions that might be true underlying oncogenic drivers of a tumor and, in some cases, appropriate targets for therapy.

Results: We developed annoFuse, an R package, and shinyFuse, a companion web application, to annotate, prioritize, and explore biologically relevant expressed gene fusions, downstream of fusion calling. We validated annoFuse using a random cohort of TCGA RNA-Seq samples (N = 160) and achieved 96% sensitivity for retention of high-confidence fusions (N = 603). annoFuse uses FusionAnnotator annotations to filter non-oncogenic and/or artifactual fusions. Fusions are then prioritized if they were previously reported in TCGA and/or contain gene partners that are known oncogenes, tumor suppressor genes, COSMIC genes, and/or transcription factors. We applied annoFuse to fusion calls from pediatric brain tumor RNA-Seq samples (N = 1028) provided as part of the Open Pediatric Brain Tumor Atlas (OpenPBTA) Project to determine recurrent fusions and recurrently fused genes within different brain tumor histologies. annoFuse annotates protein domains using the PFAM database, assesses reciprocality, and annotates gene partners for kinase domain retention. As a standard function, reportFuse enables generation of a reproducible R Markdown report to summarize filtered fusions, visualize breakpoints and protein domains by transcript, and plot recurrent fusions within cohorts. Finally, we created shinyFuse for algorithm-agnostic interactive exploration and plotting of gene fusions.

Conclusions: annoFuse provides standardized filtering and annotation for gene fusion calls from STAR-Fusion and Arriba by merging, filtering, and prioritizing putative oncogenic fusions across large cancer datasets, as demonstrated here with data from

© The Author(s) 202
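The merge, filter, and prioritize sequence described in the abstract can be illustrated with a small toy sketch. It is written in Python for brevity and is not annoFuse's R code or its API. The record layout, the label names (`read_through`, `oncogene`, and so on), the helper names `merge_calls` and `filter_and_prioritize`, and the placeholder gene names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical label vocabularies; the real annotations come from FusionAnnotator
# and curated gene lists, whose exact labels are not given in the abstract.
ARTIFACT_LABELS = {"read_through", "normal_tissue", "homology_mismap"}
PRIORITY_LABELS = {"oncogene", "tumor_suppressor", "cosmic", "transcription_factor"}


def merge_calls(calls):
    """Collapse per-caller records into one record per (gene_a, gene_b) pair."""
    merged = defaultdict(lambda: {"callers": set(), "labels": set(), "in_tcga": False})
    for call in calls:
        rec = merged[(call["gene_a"], call["gene_b"])]
        rec["callers"].add(call["caller"])
        rec["labels"].update(call.get("labels", ()))
        rec["in_tcga"] = rec["in_tcga"] or call.get("in_tcga", False)
    return dict(merged)


def filter_and_prioritize(merged):
    """Drop fusions carrying an artifact label, then keep those with a priority signal."""
    kept = {}
    for pair, rec in merged.items():
        if rec["labels"] & ARTIFACT_LABELS:
            continue  # filtered as likely non-oncogenic or artifactual
        if rec["in_tcga"] or rec["labels"] & PRIORITY_LABELS:
            kept[pair] = rec  # prioritized as putatively oncogenic
    return kept


# Toy input with placeholder gene names (not real fusion calls).
toy_calls = [
    {"gene_a": "GENE1", "gene_b": "GENE2", "caller": "STAR-Fusion", "labels": {"oncogene"}},
    {"gene_a": "GENE1", "gene_b": "GENE2", "caller": "Arriba", "labels": set()},
    {"gene_a": "GENE3", "gene_b": "GENE4", "caller": "Arriba", "labels": {"read_through", "oncogene"}},
    {"gene_a": "GENE5", "gene_b": "GENE6", "caller": "STAR-Fusion", "labels": set()},
    {"gene_a": "GENE7", "gene_b": "GENE8", "caller": "Arriba", "labels": set(), "in_tcga": True},
]

prioritized = filter_and_prioritize(merge_calls(toy_calls))
for (gene_a, gene_b), rec in sorted(prioritized.items()):
    print(f"{gene_a}--{gene_b} callers={sorted(rec['callers'])}")
```

In this sketch the artifact filter runs before prioritization, so a fusion flagged as a read-through is dropped even if one partner is a listed oncogene. Whether the package rescues such cases is not stated in the abstract.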