HPC-REDItools: a novel HPC-aware tool for improved large scale RNA-editing analysis
SOFTWARE
Open Access
Tiziano Flati1, Silvia Gioiosa1, Nicola Spallanzani1, Ilario Tagliaferri1, Maria Angela Diroma2, Graziano Pesole2,3, Giovanni Chillemi2,4, Ernesto Picardi2,3* and Tiziana Castrignanò5
From 13th Bioinformatics and Computational Biology Conference - BBCC 2018. Naples, Italy. 19-21 November 2018
*Correspondence: [email protected]
2 Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), National Research Council, Via Giovanni Amendola, 165/A, Bari 70125, Italy
4 Department for Innovation in Biological, Agro-food and Forest systems (DIBAF), University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy
Full list of author information is available at the end of the article
Abstract
Background: RNA editing is a widespread co-/post-transcriptional mechanism that alters primary RNA sequences by modifying specific nucleotides, and it can increase both transcriptome and proteome diversity. The automatic detection of RNA editing from RNA-seq data is computationally intensive and limited to small data sets, which prevents a reliable genome-wide characterisation of this process.
Results: In this work we introduce HPC-REDItools, an upgraded tool for the accurate discovery of RNA-editing events in large dataset repositories.
Availability: https://github.com/BioinfoUNIBA/REDItools2
Conclusions: HPC-REDItools is dramatically faster than the previous version, REDItools. Its MPI-based implementation enables big-data analysis and scales almost linearly with the number of available cores.
Keywords: RNA-editing, High Performance Computing, REDItools, Next Generation Sequencing, Sequence Analysis, Bioinformatics pipeline
Background
Advances in next generation sequencing (NGS) technologies have led to the production of an unprecedented amount of omic data (including genomes, transcriptomes and epigenomes from cells, tissues and organisms), changing science and medicine in ways never seen before and ushering in the “big data” era. The scale and efficiency of NGS pose the relevant challenges of sharing, archiving, integrating and analyzing these vast collections of omic data. Although tools and algorithms to handle and analyse large NGS datasets are now appearing, widely used software for common NGS analyses is still multi-threaded or serial and not ready for the “big data” era. As a consequence, the investigation
© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material