Models and algorithms for genome rearrangement with positional constraints


Swenson et al. Algorithms Mol Biol (2016) 11:13 DOI 10.1186/s13015-016-0065-9

Open Access

RESEARCH

Krister M. Swenson1,2*†, Pijus Simonaitis3† and Mathieu Blanchette4†

Abstract

Background: Traditionally, the merit of a rearrangement scenario between two gene orders has been measured based on a parsimony criterion alone; two scenarios with the same number of rearrangements are considered equally good. In this paper, we acknowledge that each rearrangement has a certain likelihood of occurring based on biological constraints, e.g. physical proximity of the DNA segments implicated or repetitive sequences.

Results: We propose optimization problems with the objective of maximizing overall likelihood, by weighting the rearrangements. We study a binary weight function suitable to the representation of sets of genome positions that are most likely to have swapped adjacencies. We give a polynomial-time algorithm for the problem of finding a minimum weight double cut and join scenario among all minimum length scenarios. In the process we solve an optimization problem on colored noncrossing partitions, which is a generalization of the Maximum Independent Set problem on circle graphs.

Conclusions: We introduce a model for weighting genome rearrangements and show that under simple yet reasonable conditions, a fundamental distance can be computed in polynomial time. This is achieved by solving a generalization of the Maximum Independent Set problem on circle graphs. Several variants of the problem are also mentioned.

Keywords: Double cut and join (DCJ), Weighted genome rearrangement, Noncrossing partitions, Chromatin conformation, Hi-C

Background

A huge body of work exists on modeling the evolution of whole chromosomes [1]. The main difference between such models is the set of rearrangements that they allow. The moves of interest are usually inversion, transposition, translocation, chromosome fission and fusion, deletion, insertion, and duplication. Almost all versions of the problem are NP-hard if content-modifying operations such as duplication, loss, and insertion are allowed [2, 3].
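As a minimal illustration of two of the moves listed above, the following Python sketch applies an inversion and a transposition to a single chromosome. The representation (a list of signed integers, where the sign encodes gene orientation) and the function names `inversion` and `transposition` are assumptions made for this example and are not taken from the paper.

```python
from typing import List

def inversion(genome: List[int], i: int, j: int) -> List[int]:
    """Reverse the segment genome[i:j] and flip the orientation of each gene in it."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]

def transposition(genome: List[int], i: int, j: int, k: int) -> List[int]:
    """Exchange the adjacent segments genome[i:j] and genome[j:k] (i <= j <= k)."""
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]

# Hypothetical chromosome with five genes, all in forward orientation.
chromosome = [1, 2, 3, 4, 5]

inverted = inversion(chromosome, 1, 4)      # [1, -4, -3, -2, 5]
moved = transposition(chromosome, 0, 2, 4)  # [3, 4, 1, 2, 5]
```

Both moves preserve gene content: every gene still occurs exactly once, and only order and orientation change.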
Fortunately, a model that considers genomes with equal content (i.e., no duplications or insertions/deletions) is quite pertinent, particularly in eukaryotes, since syntenic blocks of genes can be assigned between genomes so that each block occurs exactly once in each genome. For two genomes with equal content, double cut and join (DCJ) has been the model of choice since it elegantly includes inversion, translocation, chromosome circularization and linearization, as well as chromosome fission and fusion [4, 5]. One of the most important problems in comparative genomics is the inference of ancestral ge

*Correspondence: [email protected]
† Krister M. Swenson, Pijus Simonaitis and Mathieu Blanchette contributed equally to this work
2 Institut de Biologie Computationnelle (IBC), Montpellier, France
Full list of author information is available at the end of the article
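To make the double cut and join operation concrete, here is a minimal Python sketch, assuming the usual representation of a genome as a set of adjacencies between gene extremities (tail "t" and head "h"). It handles circular chromosomes only, leaving telomeres aside, and the helper names `circular_genome` and `dcj` are hypothetical. A DCJ cuts two adjacencies {a, b} and {c, d} and rejoins the four extremities as either {a, c}, {b, d} or {a, d}, {b, c}.

```python
from typing import FrozenSet, List, Set, Tuple

Extremity = Tuple[int, str]        # (gene, "t") for the tail, (gene, "h") for the head
Adjacency = FrozenSet[Extremity]

def circular_genome(genes: List[int]) -> Set[Adjacency]:
    """Adjacency set of one circular chromosome given as a signed gene list."""
    def ends(g: int) -> Tuple[Extremity, Extremity]:
        # (left, right) extremity of gene g in reading direction
        if g > 0:
            return (g, "t"), (g, "h")
        return (-g, "h"), (-g, "t")
    # Pair each gene with its successor, wrapping around to close the circle.
    return {frozenset({ends(x)[1], ends(y)[0]})
            for x, y in zip(genes, genes[1:] + genes[:1])}

def dcj(genome: Set[Adjacency], p: Adjacency, q: Adjacency,
        swap: bool = False) -> Set[Adjacency]:
    """Cut p = {a, b} and q = {c, d}; join {a, c}, {b, d}, or {a, d}, {b, c} if swap."""
    if p not in genome or q not in genome or p == q:
        raise ValueError("p and q must be two distinct adjacencies of the genome")
    a, b = sorted(p)
    c, d = sorted(q)
    if swap:
        c, d = d, c
    return (genome - {p, q}) | {frozenset({a, c}), frozenset({b, d})}

g = circular_genome([1, 2, 3, 4])
p = frozenset({(1, "h"), (2, "t")})
q = frozenset({(3, "h"), (4, "t")})

# One rejoining inverts the segment 2 3 ...
as_inversion = dcj(g, p, q)
# ... the other splits the chromosome into two circular chromosomes (a fission).
as_fission = dcj(g, p, q, swap=True)
```

In this example, `as_inversion` equals `circular_genome([1, -3, -2, 4])`, while `as_fission` equals the union of `circular_genome([1, 4])` and `circular_genome([2, 3])`. This shows how a single operation type covers several of the classical rearrangements named above.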
