SpectralTAD: an R package for defining a hierarchy of topologically associated domains using spectral clustering
SOFTWARE
Open Access
Kellen G. Cresswell†, John C. Stansfield and Mikhail G. Dozmorov*†
* Correspondence: mikhail. [email protected]
† Kellen G. Cresswell and Mikhail G. Dozmorov contributed equally to this work.
Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA
Abstract
Background: The three-dimensional (3D) structure of the genome plays a crucial role in gene expression regulation. Chromatin conformation capture technologies (Hi-C) have revealed that the genome is organized in a hierarchy of topologically associated domains (TADs), sub-TADs, and chromatin loops. Identifying such hierarchical structures is a critical step in understanding genome regulation. Existing tools for TAD calling are frequently sensitive to biases in Hi-C data, depend on tunable parameters, and are computationally inefficient.
Methods: To address these challenges, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors for TAD boundary identification.
Results: Our method, implemented in the R package SpectralTAD, detects hierarchical, biologically relevant TADs, selects parameters automatically, and is robust to the sequencing depth, resolution, and sparsity of Hi-C data. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy were more enriched in classical boundary marks and more conserved across cell lines and tissues. In contrast, boundaries of TADs that cannot be split into sub-TADs showed less enrichment and conservation, suggesting their more dynamic role in genome regulation.
Conclusion: SpectralTAD is available on Bioconductor, http://bioconductor.org/packages/SpectralTAD/.
Keywords: Hi-C, Chromosome conformation capture, Topologically associated domains, TADs, Hierarchy, SpectralTAD
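The eigenvector-gap idea named in the Methods summary can be illustrated on a toy contact matrix. The sketch below is not the SpectralTAD algorithm: it omits the sliding window and the automatic parameter selection, and it assumes the number of domains is known in advance. The function names, the `n_domains` parameter, and the synthetic matrix are all hypothetical, introduced only for illustration.

```python
import numpy as np


def eigenvector_gap_boundaries(contacts, n_domains):
    """Toy illustration: place domain boundaries at the largest eigenvector gaps.

    `contacts` is a symmetric, non-negative contact matrix with positive row sums.
    `n_domains` is an assumed, user-supplied number of domains (hypothetical).
    Returns sorted bin indices i such that a boundary lies between bins i and i + 1.
    """
    if n_domains < 2:
        return []  # a single domain has no internal boundary
    contacts = np.asarray(contacts, dtype=float)
    inv_sqrt_degree = 1.0 / np.sqrt(contacts.sum(axis=1))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    normalized = inv_sqrt_degree[:, None] * contacts * inv_sqrt_degree[None, :]
    laplacian = np.eye(len(contacts)) - normalized
    # eigh returns eigenvalues in ascending order; keep the eigenvectors
    # that belong to the n_domains smallest eigenvalues
    _, vectors = np.linalg.eigh(laplacian)
    embedding = vectors[:, :n_domains]
    # Scale each bin's embedding row to unit length
    embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)
    # Gap i is the Euclidean distance between the embeddings of bins i and i + 1
    gaps = np.linalg.norm(np.diff(embedding, axis=0), axis=1)
    # The n_domains - 1 largest gaps separate n_domains contiguous domains
    largest = np.argsort(gaps)[-(n_domains - 1):]
    return sorted(int(i) for i in largest)


def toy_contact_matrix(block_sizes, within=1.0, between=0.1):
    """Synthetic matrix: strong contacts inside each block, weak contacts elsewhere."""
    n = sum(block_sizes)
    matrix = np.full((n, n), between)
    start = 0
    for size in block_sizes:
        matrix[start:start + size, start:start + size] = within
        start += size
    return matrix


# Three synthetic domains of 10 bins each
boundaries = eigenvector_gap_boundaries(toy_contact_matrix([10, 10, 10]), n_domains=3)
print(boundaries)
```

On this noise-free matrix, bins inside a block share the same embedding row, so the only non-negligible gaps fall where one block ends and the next begins. Real Hi-C data are noisy and sparse, which is the setting the package itself is designed for.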
Background
The introduction of chromatin conformation capture technology and its high-throughput derivative Hi-C enabled researchers to accurately model chromatin interactions across the genome and uncover the non-random 3D structures formed by folded genomic DNA [1–3]. The structure and interactions of the DNA in 3D space inside the nucleus have been shown to shape cell type-specific gene expression [3], replication
© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons