Sequencing of animal viruses: quality data assurance for NGS bioinformatics


RESEARCH

Open Access

Gianpiero Zamperin1, Pierrick Lucas2,3, Irene Cano4, David Ryder4, Miriam Abbadi1, David Stone4, Argelia Cuenca5, Estelle Vigouroux3,6, Yannick Blanchard2,3* and Valentina Panzarin1*

Abstract

Background: Next generation sequencing (NGS) is becoming widely used in diagnostic and research laboratories and is now applied to a variety of disciplines, including veterinary virology. The NGS workflow comprises several steps, namely sample processing, library preparation, sequencing and primary, secondary and tertiary bioinformatics (BI) analyses. The BI analyses form a complex process that is extremely difficult to standardize because of the variety of tools and metrics available. It is therefore of the utmost importance to assess the comparability of results obtained through different methods and in different laboratories. To this end, we organized a proficiency test focused on the bioinformatics components for the generation of complete genome sequences of salmonid rhabdoviruses.

Methods: Three partners that performed virus sequencing using different commercial library preparation kits and NGS platforms shared 75 raw datasets with one another. Each participant analyzed the datasets separately to produce a consensus sequence according to its own bioinformatics pipeline. Results were then compared to highlight discrepancies, and a subset of inconsistencies was investigated in more detail.

Results: In total, we observed 526 discrepancies, of which 39.5% were located at genome termini, 14.1% in intergenic regions and 46.4% in coding regions. Among these, 10 SNPs and 99 indels caused changes in the protein products. Overall reproducibility was 99.94%. Based on the subset of inconsistencies investigated in depth, manual curation appeared to be the most critical step affecting sequence comparability, suggesting that harmonizing this phase is crucial to obtaining comparable results. Analysis of a calibrator sample allowed BI accuracy to be assessed at 99.983%.

Conclusions: We demonstrated the applicability and usefulness of BI proficiency testing for assuring the quality of NGS data, and we recommend wider implementation of such exercises to guarantee sequence data uniformity among different virology laboratories.

Keywords: NGS, Bioinformatics, Proficiency testing, Virology
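
The comparison step described in the abstract (finding discrepancies between consensus sequences produced by different pipelines and summarizing agreement) can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes the consensus sequences have already been aligned to equal length with `-` marking gaps, and it takes agreement to be the share of alignment columns on which all pipelines agree, which may differ from the reproducibility metric used in the study. The function name `compare_consensus` and the lab labels are hypothetical.

```python
from collections import Counter


def compare_consensus(aligned):
    """Compare pre-aligned consensus sequences from several pipelines.

    aligned: dict mapping a pipeline label to its aligned sequence
             (all sequences the same length, '-' marks a gap).
    Returns (discordant, counts, agreement) where
      discordant is a list of (1-based position, kind) tuples,
      counts tallies 'snp' and 'indel' columns,
      agreement is the percentage of columns on which all pipelines agree
      (an assumed metric, not necessarily the paper's definition).
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    n = lengths.pop()

    counts = Counter()
    discordant = []
    for i in range(n):
        # Distinct symbols observed at this column, case-insensitive.
        column = {seq[i].upper() for seq in aligned.values()}
        if len(column) == 1:
            continue  # every pipeline agrees here
        # A gap in any sequence marks the column as an indel, otherwise a SNP.
        kind = "indel" if "-" in column else "snp"
        counts[kind] += 1
        discordant.append((i + 1, kind))

    agreement = 100.0 * (n - len(discordant)) / n if n else 0.0
    return discordant, counts, agreement


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    example = {
        "lab_A": "ACGTACGTAC",
        "lab_B": "ACGTACGTAC",
        "lab_C": "ACGAAC-TAC",
    }
    discordant, counts, agreement = compare_consensus(example)
    print(discordant, dict(counts), f"{agreement:.2f}%")
```

Counting by column treats a multi-base insertion or deletion as several indel columns; merging adjacent gap columns into single events would be a reasonable refinement if discrepancies are to be counted per event.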

Background

In veterinary medicine, the diagnosis, monitoring and prevention of infectious diseases can no longer do without an accurate genetic characterization of their causative agents [1]. In fact, the number of molecular markers of

* Correspondence: [email protected]; [email protected]
2 French Agency for Food, Environmental and Occupational Health & Safety (ANSES), Ploufragan-Plouzané-Niort Laboratory, Viral Genetics and Biosecurity Unit, 22440 Ploufragan, France
1 Department of Comparative Biomedical Sciences, Istituto Zooprofilattico Sperimentale delle Venezie (IZSVe), vi