New Reduction Rules for the Tree Bisection and Reconnection Distance
Annals of Combinatorics
Steven Kelk and Simone Linz
Abstract. Recently it was shown that, if the subtree and chain reduction rules have been applied exhaustively to two unrooted phylogenetic trees, the reduced trees will have at most 15k − 9 taxa where k is the TBR (Tree Bisection and Reconnection) distance between the two trees, and that this bound is tight. Here, we propose five new reduction rules and show that these further reduce the bound to 11k − 9. The new rules combine the “unrooted generator” approach introduced in Kelk and Linz (SIAM J Discrete Math 33(3):1556–1574, 2019) with a careful analysis of agreement forests to identify (i) situations when chains of length 3 can be further shortened without reducing the TBR distance, and (ii) situations when small subtrees can be identified whose deletion is guaranteed to reduce the TBR distance by 1. To the best of our knowledge these are the first reduction rules that strictly enhance the reductive power of the subtree and chain reduction rules.
Keywords. Fixed-parameter tractability, Tree bisection and reconnection, Generator, Kernelization, Agreement forest, Phylogenetic network, Phylogenetic tree, Hybridization number.
1. Introduction
A phylogenetic tree is a tree whose leaves are bijectively labelled by a set of species (or, more generically, a set of taxa) $X$ [13]. These trees are ubiquitous in the systematic study of evolution: the leaves represent contemporary species and the internal vertices of the tree represent hypothetical common ancestors. Over the years many techniques have been developed for inferring phylogenetic trees from (incomplete) biological data and under a range of different objective functions [7]. Here we are not concerned with inferring phylogenetic trees, but rather with quantifying the “distance” between two phylogenetic trees. Such a goal is well-motivated, since different methodologies for inferring phylogenetic trees sometimes yield trees with differing topologies, and reticulate evolutionary phenomena such as hybridization can cause different genes in the same genome to have different evolutionary histories [11].
We focus on the Tree Bisection and Reconnection (TBR) distance, which is NP-hard to compute [1,10]. Informally, the TBR distance between two trees $T$ and $T'$, denoted $d_{\mathrm{TBR}}(T, T')$, is the minimum number of topological rearrangement moves that need to be applied to transform $T$ into $T'$, where such a move involves detaching a subtree and attaching it elsewhere. It was proven in 2001 [1] that the question “Is $d_{\mathrm{TBR}}(T, T') \le k$?” can be answered in time $f(k) \cdot \mathrm{poly}(|X|)$, where $f$ is a computable function that depends only on $k$. In other words: the problem is fixed-parameter tractable [6]. Specifically, the authors proved that the two polynomial-time subtree and chain reduction rules preserve the TBR distance and reduce the number of taxa to at most $28 \cdot d_{\mathrm{TBR}}(T, T')$ for any two unrooted phylogenetic trees $T$ and $T'$.
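To give a feel for how the successive bounds compare, the following minimal Python sketch evaluates the three upper bounds on the number of taxa quoted in the abstract and in this introduction: 28k from the 2001 analysis [1], the tight bound 15k − 9 for the subtree and chain rules, and 11k − 9 with the new rules. The names `KernelBounds` and `kernel_bounds` are hypothetical and introduced only for illustration. The code simply evaluates the formulas as stated; it does not check for which values of k each bound is claimed to hold.

```python
from typing import NamedTuple


class KernelBounds(NamedTuple):
    """Upper bounds on the number of taxa after exhaustive reduction."""
    original_2001: int        # 28k, subtree and chain rules, analysis of [1]
    tight_subtree_chain: int  # 15k - 9, tight bound for the same two rules
    with_new_rules: int       # 11k - 9, bound with the five new rules


def kernel_bounds(k: int) -> KernelBounds:
    """Evaluate the three bounds for a TBR distance k (illustrative helper)."""
    if k < 1:
        # Assumption: only positive distances are of interest here.
        raise ValueError("k must be a positive integer")
    return KernelBounds(28 * k, 15 * k - 9, 11 * k - 9)


if __name__ == "__main__":
    # Print a small comparison table for a few values of k.
    print("k  28k  15k-9  11k-9")
    for k in range(2, 6):
        b = kernel_bounds(k)
        print(f"{k}  {b.original_2001:>3}  {b.tight_subtree_chain:>5}  {b.with_new_rules:>5}")
```

The gap between the last two columns grows by 4 taxa for each unit increase in k, which is the saving the new rules provide over the tight 15k − 9 bound.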