A Fast and Exact Algorithm for the Exemplar Breakpoint Distance.

J Comput Biol

Laboratory for Computational Biology and Bioinformatics, School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland .

Published: May 2016

A fundamental problem in comparative genomics is to compute the distance between two genomes. For two genomes without duplicate genes, we can easily compute a variety of distance measures in linear time, but the problem is NP-hard under most models when genomes contain duplicate genes. Sankoff proposed the use of exemplars to tackle the problem of duplicate genes and gene families: each gene family is represented by a single gene (the exemplar for that family), chosen so as to optimize some metric. Unfortunately, choosing exemplars is itself an NP-hard problem. In this article, we propose a very fast and exact algorithm to compute the exemplar breakpoint distance, based on new insights in the underlying structure of genome rearrangements and exemplars. We evaluate the performance of our algorithm on simulation data and compare its performance to the best effort to date (a divide-and-conquer approach), showing that our algorithm runs much faster and scales much better. We also devise a new algorithm for the intermediate breakpoint distance problem, which can then be applied to assign orthologs. We compare our algorithm with the state-of-the-art method MSOAR by assigning orthologs among five well annotated mammalian genomes, showing that our algorithm runs much faster and is slightly more accurate than MSOAR.

Download full-text PDF

Source
http://dx.doi.org/10.1089/cmb.2015.0193DOI Listing

Publication Analysis

Top Keywords

breakpoint distance
12
duplicate genes
12
fast exact
8
exact algorithm
8
exemplar breakpoint
8
genomes duplicate
8
showing algorithm
8
algorithm runs
8
runs faster
8
algorithm
7

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!