Correction to: IonCRAM: a reference-based compression tool for ion torrent sequence files


(2020) 21:435

Open Access

CORRECTION

Moustafa Shokrof1 and Mohamed Abouelhoda2,3,4*

The original article can be found online at https://doi.org/10.1186/s12859-020-03726-9.

*Correspondence: [email protected]
4 Systems and Biomedical Engineering Department, Faculty of Engineering, Cairo University, University Square, Giza, Egypt
Full list of author information is available at the end of the article

Correction to: BMC Bioinformatics (2020) 21:397 https://doi.org/10.1186/s12859-020-03726-9

Following publication of the original article [1], the authors identified a missing section in the published article. The missing section is given below.

Algorithm IonCRAM-CompressBAM

1 Sort the BAM file (if not sorted) by genomic coordinates. Then sort the reads starting at the same locus lexicographically by their CIGAR strings.

2 Separate the signals of the forward reads from those of the reverse ones and process each group independently (in parallel) using Steps 3 and 4.

3 Remove the flow signals from the BAM file and store them separately. Compress the remaining fields of the BAM file (sequence, quality, and other fields) using a reference-based method. (We use the program Scramble [20] for this step.)

4 Define blocks of flow signals, such that the reads in each block are mapped to the same locus. Each block B can be processed in parallel using Steps 4.1 to 4.3:

4.1 Let F_r denote the r-th flow-signal vector in B (r ∈ [1..m]), and let F_r[i] ∈ ℤ denote its i-th component, 1 ≤ i ≤ n. Take F_1 as a reference vector and compute the difference vectors D_j, where D_j[i] = F_j[i] - F_{j+1}[i], 1 ≤ i ≤ n, 1 ≤ j < m.

4.2 […] If D_j[x] > 255 for any x, then we write 255 in V_j1[x] and write the value (D_j[x] - 255) in a separate list V_j2. (The length of the V_j2 list equals the number of values larger than 255 in D_j; such values are very rare in practice.)

4.3 Concatenate F_1 and the V vectors and compress them. (We use the XZ algorithm as the default method for this purpose.) F_1 is the reference flow-signal vector that will be used in decompression.

5 Wait until all (parallel) processes finish. Use the Linux tar utility to create a compressed archive containing the compressed files of the blocks B and the other compressed CRAM files computed in Step 1.
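The flow-signal part of the algorithm (Step 4) can be illustrated with a minimal Python sketch. It is not the IonCRAM implementation; it only shows consecutive differencing, the 255 cap with an overflow list, and XZ compression on a toy block. Several points are assumptions made for illustration: all function names are hypothetical, the payload is serialized as JSON before compression (the real on-disk layout is not described in this notice), negative differences are stored unchanged, and values equal to 255 are also escaped (with an excess of 0) so that decoding is unambiguous, whereas the text states the rule only for values larger than 255.

```python
import json
import lzma

CAP = 255  # cap value named in the algorithm


def difference_vectors(block):
    """Return D_j[i] = F_j[i] - F_{j+1}[i] for consecutive vectors in a block."""
    return [[a - b for a, b in zip(block[j], block[j + 1])]
            for j in range(len(block) - 1)]


def split_overflow(d):
    """Split D_j into (V_j1, V_j2): capped values and the excess over the cap.

    Assumption: values >= CAP are escaped, so an exact 255 stores an excess of 0.
    """
    v1, v2 = [], []
    for value in d:
        if value >= CAP:
            v1.append(CAP)
            v2.append(value - CAP)
        else:
            v1.append(value)
    return v1, v2


def merge_overflow(v1, v2):
    """Inverse of split_overflow: re-add the stored excess at each capped slot."""
    excess = iter(v2)
    return [CAP + next(excess) if value == CAP else value for value in v1]


def restore_block(f1, diffs):
    """Rebuild all vectors from the reference F_1: F_{j+1} = F_j - D_j."""
    block = [list(f1)]
    for d in diffs:
        block.append([a - b for a, b in zip(block[-1], d)])
    return block


def compress_block(block):
    """Encode one block (list of equal-length integer lists) and XZ-compress it."""
    encoded = [split_overflow(d) for d in difference_vectors(block)]
    # JSON is a stand-in serialization chosen for readability (assumption).
    payload = json.dumps({"f1": block[0], "v": encoded}).encode("ascii")
    return lzma.compress(payload, format=lzma.FORMAT_XZ)


def decompress_block(blob):
    """Invert compress_block and return the original flow-signal vectors."""
    data = json.loads(lzma.decompress(blob))
    diffs = [merge_overflow(v1, v2) for v1, v2 in data["v"]]
    return restore_block(data["f1"], diffs)


if __name__ == "__main__":
    toy_block = [[100, 400, 0], [98, 100, 3], [98, 101, 2]]  # made-up signals
    assert decompress_block(compress_block(toy_block)) == toy_block
```

Because reads in a block map to the same locus, consecutive flow-signal vectors tend to be similar, so the difference vectors are dominated by small values that a general-purpose compressor such as XZ handles well; the overflow list keeps the rare large differences out of the main stream.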

© The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permis