Long-read individual-molecule sequencing reveals CRISPR-induced genetic heterogeneity in human ESCs
METHOD
Open Access
Chongwei Bi1†, Lin Wang1,2†, Baolei Yuan1, Xuan Zhou1, Yu Li3, Sheng Wang3, Yuhong Pang4, Xin Gao3, Yanyi Huang4,5* and Mo Li1*
* Correspondence: [email protected]. cn; [email protected]
† Chongwei Bi and Lin Wang are co-first authors.
4 Beijing Advanced Innovation Center for Genomics (ICG), Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, College of Chemistry, College of Engineering, Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
1 Laboratory of Stem Cell and Regeneration, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
Full list of author information is available at the end of the article
Abstract
Quantifying the genetic heterogeneity of a cell population is essential to understanding biological systems. We develop a universal method to label individual DNA molecules for single-base-resolution, haplotype-resolved, quantitative characterization of diverse types of rare variants, at frequencies as low as 4 × 10⁻⁵, using either short- or long-read sequencing platforms. The method provides the first quantitative evidence of persistent nonrandom large structural variants and an increase in single-nucleotide variants at the on-target locus following repair of double-strand breaks induced by CRISPR-Cas9 in human embryonic stem cells.
Keywords: Human embryonic stem cell, CRISPR-Cas9, Genome editing, Nanopore sequencing, Long-read sequencing, Next-generation sequencing, Somatic mutation, Structural variant
Background
Molecular consensus sequencing has been developed to enhance the accuracy of short-read next-generation sequencing (NGS) using unique molecular identifiers (UMIs) [1–3]. Combining UMIs with bioinformatics enables the correction of random errors introduced by sequencing chemistry or detection. However, it remains challenging to analyze various types of genetic variants, because current methods are inadequate for detecting rare and/or complex variants (Additional file 1: Fig. S1). A case in point is the recent revelation that genome editing by CRISPR-Cas9 can lead to large deletions and complex rearrangements in various cell types, including mouse embryonic stem cells (mESCs) [4, 5]. It is unclear whether this phenomenon also occurs in human ESCs (hESCs) with the same characteristics and, more importantly, an unbiased and quantitative characterization of CRISPR-induced mutagenesis is still lacking owing to the limitations of current strategies. Single-molecule sequencing technologies can better resolve complex genetic variants by providing long reads [6], but they have a lower raw read accuracy [3]. To overcome
© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as yo
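The general idea behind UMI-based error correction described in the Background can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `umi_consensus`, the `min_reads` threshold, and the assumption that all reads sharing a UMI are already aligned to the same length are hypothetical simplifications introduced here for illustration. Reads are grouped by UMI, and each position is resolved by majority vote, so that a random sequencing error present in only a minority of reads is outvoted.

```python
from collections import Counter, defaultdict


def umi_consensus(tagged_reads, min_reads=3):
    """Build one consensus sequence per UMI by per-position majority vote.

    tagged_reads: iterable of (umi, sequence) pairs. Sequences sharing a UMI
    are assumed (hypothetically) to be pre-aligned and of equal length.
    min_reads: illustrative threshold; UMI groups with fewer reads are skipped.
    """
    # Group reads that carry the same UMI, i.e. copies of one original molecule.
    groups = defaultdict(list)
    for umi, seq in tagged_reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few copies to outvote a random error
        length = len(seqs[0])
        if any(len(s) != length for s in seqs):
            raise ValueError(f"reads for UMI {umi} differ in length")
        # zip(*seqs) yields one column of bases per position; take the most
        # common base. Ties resolve to the base seen first in that column.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"),  # one read has an error
        ("GGT", "ACGA"), ("GGT", "ACGA"), ("GGT", "ACGA"),  # consistent variant
        ("TTT", "ACGT"),                                     # single read, skipped
    ]
    print(umi_consensus(reads))
```

In this toy input, the mismatch in one `AAC` read is treated as a random error and outvoted, whereas the base shared by every `GGT` read is retained as a candidate true variant. Real long-read data would need alignment and indel handling before any such vote.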