GLaMST: grow lineages along minimum spanning tree for b cell receptor sequencing data
RESEARCH
Open Access
Xingyu Yang1, Christopher M. Tipton2, Matthew C. Woodruff2, Enlu Zhou3, F. Eun-Hyung Lee4, Iñaki Sanz2 and Peng Qiu5*
From The Sixth International Workshop on Computational Network Biology: Modeling, Analysis, and Control (CNB-MAC 2019), Niagara Falls, NY, USA. 07 September 2019
Abstract
Background: B cell affinity maturation enables B cells to generate high-affinity antibodies. This process involves somatic hypermutation of B cell immunoglobulin receptor (BCR) genes and selection by their ability to bind antigens. Lineage trees are used to describe this microevolution of B cell immunoglobulin genes. In a lineage tree, each node is one BCR sequence that mutated from the germinal center, and each directed edge represents a single base mutation, insertion or deletion. BCR sequencing data contain only a subset of the BCR sequences in this microevolution process. Therefore, reconstructing the lineage tree from experimental data requires algorithms that build the tree from partially observed tree nodes.
Results: We developed a new algorithm named Grow Lineages along Minimum Spanning Tree (GLaMST), which efficiently reconstructs the lineage tree given observed BCR sequences that correspond to a subset of the tree nodes. In comparisons on simulated and real data, GLaMST outperforms existing algorithms in simulations with high rates of mutation, insertion and deletion, and generates lineage trees that are smaller and closer to the ground truth according to tree features that are highly correlated with selection pressure.
Conclusions: GLaMST outperforms the state of the art in reconstruction of the BCR lineage tree in both efficiency and accuracy. Integrating it into existing BCR sequencing analysis frameworks can significantly improve the lineage tree reconstruction aspect of the analysis.
Keywords: B cell receptor gene, Lineage tree

Background
To specifically recognize and respond to different pathogens, the adaptive immune system relies on a diverse repertoire of B cell immunoglobulin receptors (BCR).
Such a diverse repertoire comes from recombination, somatic hypermutation of immunoglobulin (Ig) gene segments, and selection by their ability to bind pathogens. This process generates numerous different BCRs. To explore the dynamic process of BCR affinity maturation, researchers have applied high-throughput sequencing [1–3] to examine BCR repertoires and to construct lineage trees of BCR sequences [4, 5]. In a BCR lineage tree, each tree node corresponds to one unique sequence, and each directed edge indicates the relationship between one sequence and its immediate ancestor, which are separated by one-base muta-

*Correspondence: [email protected]
5 Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, USA
Full list of author information is available at the end of the article
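
Because every directed edge in a lineage tree stands for a single-base mutation, insertion or deletion, the edit (Levenshtein) distance between two sequences is the minimum number of edges on any path joining them. The sketch below is only an illustration of that idea, not the GLaMST algorithm itself: it computes pairwise edit distances and connects the observed sequences with a plain Prim minimum spanning tree. The function names and the toy sequences are hypothetical, and the first sequence is simply assumed to play the role of the root.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-base
    substitutions, insertions or deletions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def minimum_spanning_tree(seqs):
    """Prim's algorithm on the complete graph weighted by edit distance.

    Returns (parent_index, child_index, distance) tuples, grown outward
    from seqs[0], which is assumed here to be the root sequence.
    """
    in_tree = {0}
    edges = []
    while len(in_tree) < len(seqs):
        # Cheapest edge that links the current tree to a new sequence.
        d, u, v = min(
            (edit_distance(seqs[u], seqs[v]), u, v)
            for u in in_tree
            for v in range(len(seqs))
            if v not in in_tree
        )
        edges.append((u, v, d))
        in_tree.add(v)
    return edges


# Hypothetical toy sequences, not taken from the paper.
observed = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "ACGTCGT"]
for parent, child, d in minimum_spanning_tree(observed):
    print(f"{observed[parent]} -> {observed[child]}  edits: {d}")
```

If two observed sequences are in a direct ancestor-descendant relationship and their edit distance d is greater than 1, at least d - 1 intermediate sequences are missing from the data, which is why reconstruction has to infer unobserved nodes rather than only link the observed ones.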
© The Author(s). 2020 Open Access This article is licensed under a Creative Co