A comparison of BeadChip and WGS genotyping outputs using partial validation by Sanger sequencing


RESEARCH

Open Access

Kirill A. Danilov1,2*, Dimitri A. Nikogosov1, Sergey V. Musienko1 and Ancha V. Baranova3,4

From 11th International Young Scientists School “Systems Biology and Bioinformatics” – SBB-2019, Novosibirsk, Russia. 24-28 June 2019

Abstract

Background: Head-to-head comparison of the precision of BeadChip and WGS/WES genotyping techniques is far from straightforward. Sanger sequencing, a tool for validating high-throughput genotyping calls, is neither scalable nor practical for large-scale DNA processing. Here we report a cross-validation analysis of genotyping calls obtained via Illumina GSA BeadChip and WGS (Illumina HiSeq X Ten) techniques.

Results: When compared to each other, the average precision and accuracy of BeadChip and WGS genotyping techniques exceeded 0.991 and 0.997, respectively. The average fraction of discordant variants for both platforms was 0.639%. A sliding window approach was used to identify genomic regions of at most 500 bp containing the maximal number of discordant variants for further validation by Sanger sequencing. Notably, 12 of the 26 variants located within the eight identified regions were consistently discordant in related calls made by WGS and BeadChip. When Sanger sequenced, a total of 16 of these genotypes were successfully resolved, indicating that the precision of WGS and BeadChip genotyping for this genotype subset was 0.81 and 0.5, respectively, with accuracy values of 0.87 and 0.61.

Conclusions: We conclude that WGS genotype calling exhibits higher overall precision within the selected set of discordantly genotyped variants, though the number of validated variants remained insufficient.

Keywords: WGS, WES, Whole genome sequencing, Microarray genotyping, Genotype concordance, Sanger sequencing
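The sliding window search mentioned in the Results can be illustrated with a short sketch. This is not the authors' implementation: the function name densest_window, the inclusive definition of window span, and the example positions are all assumptions made for illustration. Given the positions of discordant variants on one chromosome, it returns the window of at most 500 bp that contains the most of them.

```python
from typing import List, Tuple


def densest_window(positions: List[int], max_span: int = 500) -> Tuple[int, int, int]:
    """Return (start, end, count) for the window of at most max_span bp
    that contains the most discordant variant positions.

    Assumptions (not taken from the paper): positions are 1-based
    coordinates on a single chromosome, and the span of a window is
    counted inclusively as end - start + 1. Ties keep the leftmost window.
    An empty input yields (0, 0, 0).
    """
    pos = sorted(positions)
    best = (0, 0, 0)
    left = 0
    for right in range(len(pos)):
        # Advance the left edge until the window fits within max_span bp.
        while pos[right] - pos[left] + 1 > max_span:
            left += 1
        count = right - left + 1
        if count > best[2]:
            best = (pos[left], pos[right], count)
    return best


# Hypothetical discordant variant positions, for illustration only.
example_positions = [100, 150, 400, 599, 1000, 1200, 1300, 1450]
print(densest_window(example_positions))
```

Because the positions are sorted once and each edge of the window only moves forward, the scan itself is linear in the number of discordant variants. To pick several regions, as in the eight regions reported here, one plausible extension is to remove the variants in the best window and repeat the search.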

* Correspondence: [email protected]
1 Atlas Biomed Group Limited, Tintagel House, 92 Albert Embankment, Lambeth, London SE1 7TY, UK
2 Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30, bld. 1, 121205 Moscow, Russia
Full list of author information is available at the end of the article

Background

Both Whole Genome Sequencing (WGS) and Whole Exome Sequencing (WES) are now used in multiple avenues of clinical and scientific inquiry. Despite the increased availability of these techniques and the rapid decline of associated costs, their context-dependent per-base performance remains in question. The performance characteristics of WGS/WES include accuracy (the extent of agreement between the reference and the assay-derived nucleic acid sequence), precision (broadly defined as repeatability for within-run precision and reproducibility for between-run precision), analytical sensitivity, specificity, and a reportable range of the reference genome coverage [1]. While the repeatability issues were presented in detail previously [2], the absence of scalable

© The Author(s). 2020 Open Access This article is licensed under a C