Sequencing of E. coli strain UTI89 on multiple sequencing platforms


BMC Research Notes Open Access

DATA NOTE

Sequencing of E. coli strain UTI89 on multiple sequencing platforms

Shannon N. Fenlon1, Yuemin Celina Chee2, Jacqueline Lai Yuen Chee1, Yeen Hui Choy1, Alexis Jiaying Khng1, Lu Ting Liow2, Kurosh S. Mehershahi2, Xiaoan Ruan1, Stephen W. Turner3, Fei Yao1 and Swaine L. Chen1,2*

*Correspondence: [email protected]‑star.edu.sg
1 Genome Institute of Singapore, 60 Biopolis Street, Genome, #02‑01, Singapore 138672, Singapore
Full list of author information is available at the end of the article

Abstract

Objectives: The availability of matched sequencing data for the same sample across different sequencing platforms is a necessity for validation and effective comparison of sequencing platforms. A commonly sequenced sample is the lab-adapted MG1655 strain of Escherichia coli; however, this strain is not fully representative of the more complex and dynamic genomes of pathogenic E. coli strains.

Data description: We present six new sequencing data sets for another E. coli strain, UTI89, which is an extraintestinal pathogenic strain isolated from a patient suffering from a urinary tract infection. We now provide matched whole genome sequencing data generated using the PacBio RSII, Oxford Nanopore MinION R9.4, Ion Torrent, ABI SOLiD, and Illumina NextSeq sequencers. Together with other publicly available datasets, UTI89 has a nearly complete suite of data generated on most second- and third-generation sequencers. These data can be used as an additional validation set for new sequencing technologies and analytical methods. More than being another E. coli strain, however, UTI89 is pathogenic, with a 10% larger genome, additional pathogenicity islands, and a large plasmid, features that are common among other naturally occurring and disease-causing E. coli isolates. These data therefore provide a more medically relevant test set for development of algorithms.

Keywords: Escherichia coli, UPEC, Urinary Tract Infection (UTI), Ion Torrent, SOLiD, Illumina, Oxford Nanopore, MinION, PacBio, Roche 454

Objective

Control sequencing data generated across different sequencing platforms are extremely important for validation and effective comparison of those platforms. A commonly sequenced sample that has been extensively used for these purposes is the MG1655 strain of E. coli [1]. However, the MG1655 genome is smaller and less complex than those of some pathogenic E. coli strains [2, 3]. As part of control experiments, we have sequenced UTI89, a uropathogenic E. coli (UPEC) strain originally isolated from a patient suffering from an acute bladder infection [4], using several different sequencing technologies, including ABI SOLiD, Ion Torrent, PacBio, Oxford Nanopore, and Illumina. Our new data supplement previously published sequencing data generated using the Roche 454 [4], Illumina HiSeq [5], and the original Oxford Nanopore Technologies MinION [6]. With the inclusion of these new data sets, E. coli strain UTI89 now has a nearly complete set of raw sequence data generated using most second- and third-gene