Exact median-tree inference for unrooted reconciliation costs

  • PDF / 1,853,242 Bytes
  • 15 Pages / 595 x 791 pts Page_size
  • 68 Downloads / 190 Views

DOWNLOAD

REPORT


RESEARCH

Open Access

Exact median-tree inference for unrooted reconciliation costs Paweł Górecki1* , Alexey Markin2 and Oliver Eulenstein2 From The 18th Asia Pacific Bioinformatics Conference Seoul, Korea. 18-20 August 2020

Abstract Background: Solving median tree problems under tree reconciliation costs is a classic and well-studied approach for inferring species trees from collections of discordant gene trees. These problems are NP-hard, and therefore are, in practice, typically addressed by local search heuristics. So far, however, such heuristics lack any provable correctness or precision. Further, even for small phylogenetic studies, it has been demonstrated that local search heuristics may only provide sub-optimal solutions. Obviating such heuristic uncertainties are exact dynamic programming solutions that allow solving tree reconciliation problems for smaller phylogenetic studies. Despite these promises, such exact solutions are only suitable for credibly rooted input gene trees, which constitute only a tiny fraction of the readily available gene trees. Standard gene tree inference approaches provide only unrooted gene trees and accurately rooting such trees is often difficult, if not impossible. Results: Here, we describe complex dynamic programming solutions that represent the first nonnaïve exact solutions for solving the tree reconciliation problems for unrooted input gene trees. Further, we show that the asymptotic runtime of the proposed solutions does not increase when compared to the most time-efficient dynamic programming solutions for rooted input trees. Conclusions: In an experimental evaluation, we demonstrate that the described solutions for unrooted gene trees are, like the solutions for rooted input gene trees, suitable for smaller phylogenetic studies. Finally, for the first time, we study the accuracy of classic local search heuristics for unrooted tree reconciliation problems. Keywords: Median tree, Reconciliation, Gene duplication, Gene loss, Deep coalescence, Exact solution

Background Phylogenetic trees visualize estimates of the evolutionary relationships among multiple biological entities such as molecular sequences, genomes, and species. For biologists, such trees present a fundamental tool for analyzing how distinct biological entities have evolved but are full of complexities and seemingly irreconcilable differences [1]. The implications and potential applications of *Correspondence: [email protected] University of Warsaw, Faculty of Mathematics, Informatics and Mechanics, Banacha 2, 02-097 Warsaw, Poland Full list of author information is available at the end of the article 1

phylogenetic analyses are widespread and are concerning a wide variety of central research areas including biology, ecology, epidemiology, and conservation biology (e.g., [2–6]). Conventional phylogenetic tree inference samples an individual gene (i.e., a gene family) for a collection of species and reconstructs the evolutionary history, called a gene tree, of this gene. The gene is a portion of the spe