Algorithms for Molecular Biology Open Access

RESEARCH

Finding local genome rearrangements

Pijus Simonaitis1 and Krister M. Swenson1,2*

Abstract

Background: The double cut and join (DCJ) model of genome rearrangement is well studied due to its mathematical simplicity and its power to account for the many events that transform gene order. These studies have mostly been devoted to understanding minimum-length scenarios transforming one genome into another. In this paper we search instead for rearrangement scenarios that minimize the number of rearrangements whose breakpoints are unlikely according to some biological criterion. One such criterion has recently become accessible with the advent of the Hi-C experiment, which facilitates the study of 3D spatial distance between breakpoint regions.

Results: We establish a link between the minimum number of unlikely rearrangements required by a scenario and the problem of finding a maximum edge-disjoint cycle packing on a certain transformed version of the adjacency graph. This link leads to a 3/2-approximation as well as an exact integer linear programming formulation for our problem, which we prove to be NP-complete. We also present experimental results on fruit flies, showing that Hi-C data is informative when used as a criterion for rearrangements.

Conclusions: A new variant of the weighted DCJ distance problem is addressed that ignores scenario length in its objective function. A solution to this problem provides a lower bound on the number of unlikely moves necessary when transforming one gene order into another. This lower bound aids in the study of rearrangement scenarios with respect to chromatin structure, and could eventually be used in the design of a fixed-parameter algorithm with a more general objective function.

Keywords: Genome rearrangement, Double cut and join, Hi-C, Chromatin conformation, Maximum edge-disjoint cycle packing, NP-complete

*Correspondence: [email protected]
1 CNRS, LIRMM, Université Montpellier, 161 rue Ada, 34392 Montpellier, France
Full list of author information is available at the end of the article

Background

The problem of sorting genomes by a minimum number of biologically plausible rearrangements has been central to the theoretical comparative genomics community for roughly a quarter century. Traditionally, the likelihood of a rearrangement scenario has been based solely on the parsimony criterion. Unfortunately, the number of parsimonious scenarios between a pair of genomes can be huge [1–3]. This highlights the importance of methods that infer scenarios conforming to additional biological constraints. To this end we turn to data describing the 3D organization of chromatin, which is increasingly available due to the advent of an experiment called Hi-C [4, 5]. Indeed, the 3D spatial proximity of breakpoint regions has an important role in the formation [6, 7] and fixation [8] of genome rearrangements. We have begun developing methodology suitable for use with this type of constraint. Syntenic blocks o