GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes



SOFTWARE

Open Access

Jian-Jun Jin1†, Wen-Bin Yu2,3,4†, Jun-Bo Yang1, Yu Song2,3,4, Claude W. dePamphilis5, Ting-Shuang Yi1* and De-Zhu Li1*

* Correspondence: tingshuangyi@mail.kib.ac.cn; [email protected]
† Jian-Jun Jin and Wen-Bin Yu contributed equally to this work.
1 Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan 650201, China
Full list of author information is available at the end of the article

Abstract

GetOrganelle is a state-of-the-art toolkit to accurately assemble organelle genomes from whole genome sequencing data. It recruits organelle-associated reads using a modified “baiting and iterative mapping” approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published plant datasets, we are able to reassemble the circular plastomes from 47 datasets using GetOrganelle. GetOrganelle assemblies are more accurate than published and/or NOVOPlasty-reassembled plastomes as assessed by mapping. We also assemble complete mitochondrial genomes using GetOrganelle. GetOrganelle is freely released under a GPL-3 license (https://github.com/Kinggerm/GetOrganelle).

Keywords: Assembler, Assembly graph, Plastome, Mitogenome, Organelle genome
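The read-recruitment idea named in the abstract can be illustrated with a toy sketch. The code below is not GetOrganelle's implementation; it is a minimal illustration that assumes exact k-mer sharing as the matching criterion, ignores reverse complements and sequencing errors, and uses hypothetical names (`kmers`, `recruit_reads`) and made-up sequences. It shows only the general principle: reads that share a k-mer with a seed are recruited, their k-mers are added to the bait, and the search repeats until no new reads are found.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(seed, reads, k=5, max_rounds=10):
    """Toy baiting-and-iterative recruitment (illustrative assumption only).

    A read is recruited when it shares at least one exact k-mer with the
    current bait; its k-mers then extend the bait for the next round.
    """
    bait = kmers(seed, k)
    recruited = set()  # indices of reads recruited so far
    for _ in range(max_rounds):
        newly = [i for i, read in enumerate(reads)
                 if i not in recruited and kmers(read, k) & bait]
        if not newly:
            break  # converged: no further reads match the bait
        for i in newly:
            recruited.add(i)
            bait |= kmers(reads[i], k)
    return [reads[i] for i in sorted(recruited)]


# Made-up example data: three overlapping reads and one unrelated read.
seed = "ACGTTGCA"
reads = ["TACCGATGAC", "GGGGGAAAAATTTTT", "TTGCAAGGCT", "AGGCTTACCG"]
print(recruit_reads(seed, reads))
# ['TACCGATGAC', 'TTGCAAGGCT', 'AGGCTTACCG']
```

In this example only "TTGCAAGGCT" overlaps the seed directly; the other two target reads are reached in later rounds through reads recruited earlier, which is why the process is iterative rather than a single mapping pass.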

Background

The plastid genome (plastome, including the chloroplast and other plastid forms) and mitochondrial genome (mitogenome or chondriome) represent the portions of endosymbiotic organelle inheritance in eukaryotes that have remained in the organelle without being transferred to the nucleus or lost. The plastomes of photosynthetic eukaryotes are generally 120–150 kb in size and typically map as a highly conserved circular and quadripartite structure, with a pair of inverted repeat regions (IRs) that separate the large single copy (LSC) region from the small single copy (SSC) region [1, 2].

Mitogenomes exist in nearly all eukaryotic organisms and vary greatly in genome size and form. To date, six main types of mitogenome organization have been recognized [3]. Animal mitogenomes map as a single circular molecule, ranging from 11 to 28 kb in size and either lacking introns (i.e., type I) or including introns (types II–VI). Fungi and plants have single circular mitogenomes with introns from 19 to 1000 kb in size (type II), or a large and homogenous circular molecule from 20 to 1000 kb in size with small circular plasmid-like molecules (type III), or homogenous linear molecules

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the