Increased Sensitivity of Diagnostic Mutation Detection by Re-analysis Incorporating Local Reassembly of Sequence Reads



ORIGINAL RESEARCH ARTICLE

Christopher M. Watson1,2,3 • Nick Camm1 • Laura A. Crinnion1,2 • Samuel Clokie4 • Rachel L. Robinson1 • Julian Adlard1 • Ruth Charlton1 • Alexander F. Markham2,3 • Ian M. Carr2,3 • David T. Bonthron1,2,3



© Springer International Publishing AG 2017

Abstract

Background Diagnostic genetic testing programmes based on next-generation DNA sequencing have resulted in the accrual of large datasets of targeted raw sequence data. Most diagnostic laboratories process these data through an automated variant-calling pipeline. Validation of the chosen analytical methods typically depends on confirming the detection of known sequence variants. Despite improvements in short-read alignment methods, current pipelines are known to be comparatively poor at detecting large insertion/deletion mutations.

Electronic supplementary material The online version of this article (doi:10.1007/s40291-017-0304-x) contains supplementary material, which is available to authorized users.

Methods We performed clinical validation of a local reassembly tool, ABRA (assembly-based realigner), through retrospective reanalysis of a cohort of more than 2000 hereditary cancer cases.

Results ABRA enabled detection of a 96-bp deletion, 4-bp insertion mutation in PMS2 that had been initially identified using a comparative read-depth approach. We applied an updated pipeline incorporating ABRA to the entire cohort of 2000 cases and identified one previously undetected pathogenic variant, a 23-bp duplication in PTEN. We demonstrate the effect of read length on the ability to detect insertion/deletion variants by comparing HiSeq2500 (2 × 101-bp) and NextSeq500 (2 × 151-bp) sequence data for a range of variants and thereby show that the limitations of shorter read lengths can be mitigated using appropriate informatics tools.

Conclusions This work highlights the need for ongoing development of diagnostic pipelines to maximize test sensitivity. We also draw attention to the large differences in computational infrastructure required to perform day-to-day versus large-scale reprocessing tasks.
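As a rough illustration of why read length matters for insertion detection, the sketch below counts how many start offsets allow a single read to contain an entire inserted sequence together with a minimum number of flanking reference-matching bases on each side. This is a toy model and not the analysis performed in this study: the function name `spanning_read_starts` and the 20-bp anchor requirement are assumptions chosen for illustration, and real aligners and ABRA behave in more complex ways.

```python
def spanning_read_starts(read_len: int, insert_len: int, min_anchor: int) -> int:
    """Count read start offsets at which one read contains the whole inserted
    sequence plus at least `min_anchor` matching bases on each side.

    Toy model only: a window of `read_len` bases must cover a fixed segment of
    `insert_len + 2 * min_anchor` bases, which leaves
    read_len - (insert_len + 2 * min_anchor) + 1 possible placements.
    """
    return max(0, read_len - insert_len - 2 * min_anchor + 1)


if __name__ == "__main__":
    # Read lengths taken from the abstract; the 23-bp insert size matches the
    # PTEN duplication. The 20-bp anchor is an assumed, illustrative value.
    for read_len in (101, 151):
        n = spanning_read_starts(read_len, insert_len=23, min_anchor=20)
        print(f"{read_len}-bp reads: {n} spanning start offsets")
```

Under these assumptions, 101-bp reads have 39 spanning start offsets and 151-bp reads have 89, so at equal coverage fewer of the shorter reads carry the whole variant with usable flanks. Reads that only partly overlap the inserted sequence are the ones a local reassembly step can help recover.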

Correspondence: Christopher M. Watson [email protected]

1 Yorkshire Regional Genetics Service, St. James’s University Hospital, 6.2 Clinical Sciences Building, Leeds LS9 7TF, United Kingdom

2 MRC Single Cell Functional Genomics Centre, St. James’s University Hospital, University of Leeds, Leeds LS9 7TF, United Kingdom

3 MRC Medical Bioinformatics Centre, Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9JT, United Kingdom

4 West Midlands Regional Genetics Laboratory, Birmingham Women’s NHS Foundation Trust, Birmingham B15 2TG, United Kingdom

Key Points

We demonstrate how reprocessing legacy datasets using improved bioinformatics tools can increase diagnostic test sensitivity and show how variant detection is affected by sequencing read lengths.

We describe the importance of this approach and highlight the computation