Resolving misalignment interference for NGS-based clinical diagnostics

ORIGINAL INVESTIGATION

Che-yu Lee1 · Hai-Yun Yen1 · Alan W. Zhong1 · Hanlin Gao1
Received: 15 February 2020 / Accepted: 31 July 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020

Abstract Next-generation sequencing (NGS) is an incredibly useful tool for genetic disease diagnosis. However, the most commonly used bioinformatics methods for analyzing sequence reads insufficiently discriminate genomic regions with extensive sequence identity, such as gene families and pseudogenes, complicating diagnostics. This problem has been recognized for specific genes, including many involved in human disease, and diagnostic labs must perform additional costly steps to guarantee accurate diagnosis in these cases. Here we report a new data analysis method based on the comparison of read depth between highly homologous regions to identify misalignment. Analyzing six clinically important genes (CYP21A2, GBA, HBA1/2, PMS2, and SMN1), each exhibiting misalignment issues related to homology, we show that our technique can correctly identify potential misalignment events and be used to make appropriate calls. Combined with long-range PCR and/or MLPA orthogonal testing, our clinical laboratory can improve variant calling with minimal additional cost. We propose an accurate and cost-efficient NGS testing procedure that will benefit disease diagnostics, carrier screening, and research-based population studies.
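
The read-depth comparison summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `DepthPair` type, the function names, the expected gene fraction of 0.5, and the tolerance of 0.15 are all hypothetical choices made for illustration. The sketch assumes that, when reads are placed correctly, a gene window and its matched homologous window receive roughly equal depth, so a strongly skewed share suggests that reads may have been drawn to the wrong copy.

```python
from dataclasses import dataclass


@dataclass
class DepthPair:
    """Mean read depth over a gene window and its matched homologous window.

    Hypothetical container for illustration; not a structure from the paper.
    """
    label: str
    gene_depth: float
    homolog_depth: float


def gene_fraction(pair: DepthPair) -> float:
    """Share of the combined depth that was aligned to the gene window."""
    total = pair.gene_depth + pair.homolog_depth
    if total <= 0:
        raise ValueError(f"no coverage in {pair.label}")
    return pair.gene_depth / total


def flag_misalignment(pairs, expected=0.5, tolerance=0.15):
    """Return labels of windows whose gene fraction deviates from `expected`
    by more than `tolerance`. Both values are illustrative assumptions."""
    return [p.label for p in pairs
            if abs(gene_fraction(p) - expected) > tolerance]


if __name__ == "__main__":
    windows = [
        DepthPair("exon_A", 100.0, 100.0),  # balanced depth
        DepthPair("exon_B", 160.0, 40.0),   # depth skewed toward the gene
    ]
    print(flag_misalignment(windows))
```

A skewed fraction on its own does not distinguish misalignment from a genuine copy-number change, so in this sketch a flagged window is only a candidate for follow-up, in line with the orthogonal long-range PCR and MLPA testing mentioned in the abstract.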

Che-yu Lee and Hai-Yun Yen contributed equally to this work.

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s00439-020-02216-5) contains supplementary material, which is available to authorized users.

* Che-yu Lee [email protected]

1 Fulgent Genetics, Temple City, CA 91780, USA

Introduction High throughput, good sensitivity, and decreasing cost have established next-generation sequencing (NGS) as a widely useful method for identifying genomic variants in clinical diagnostics. However, researchers continue to face major challenges related to the high level of similarity between genes and pseudogenes (or homologous genes/regions). Pseudogenes are defined here as non-functional segments of DNA that resemble functional genes. In this study, HBA2 and SMN2 are homologous genes, whereas CYP21A1P, GBAP1, and PMS2CL are pseudogenes. Gene duplication is a common event and a fundamental force driving the evolution of novel gene functions while preserving the original gene and its associated mechanisms; this process produces both pseudogenes and highly homologous genes (Innan and Kondrashov 2010; Magadum et al. 2013). Among the 4800+ disease-causing genes in Illumina's clinical exome capture set, approximately 10% have either pseudogenes or partial segmental duplication issues (Bailey et al. 2001; Mandelker et al. 2016). With traditional genetic testing methods such as DNA microarray or Sanger sequencing, limitations of probe/primer design often make the capture or amplification of both the target gen