A note on efficient computation of haplotypes via perfect phylogeny.

J Comput Biol

Department of Computer Science & Engineering, University of California at San Diego, La Jolla, CA 92093, USA.

Published: March 2005

The problem of inferring haplotype phase from a population of genotypes has received a lot of attention recently. This is partly due to the observation that there are many regions on human genomic DNA where genetic recombination is rare (Helmuth, 2001; Daly et al., 2001; Stephens et al., 2001; Friss et al., 2001). A Haplotype Map project has been announced by NIH to identify and characterize populations in terms of these haplotypes. Recently, Gusfield introduced the perfect phylogeny haplotyping problem, as an algorithmic implication of the no-recombination in long blocks observation, together with the standard population-genetic assumption of infinite sites. Gusfield's solution based on matroid theory was followed by direct theta(nm2) solutions that use simpler techniques (Bafna et al., 2003; Eskin et al., 2003), and also bound the number of solutions to the PPH problem. In this short note, we address two questions that were left open. First, can the algorithms of Bafna et al. (2003) and Eskin et al. (2003) be sped-up to O(nm + m2) time, which would imply an O(nm) time-bound for the PPH problem? Second, if there are multiple solutions, can we find one that is most parsimonious in terms of the number of distinct haplotypes. We give reductions that suggests that the answer to both questions is "no." For the first problem, we show that computing the output of the first step (in either method) is equivalent to Boolean matrix multiplication. Therefore, the best bound we can presently achieve is O(nm(omega-1)), where omega < or = 2.52 is the exponent of matrix multiplication. Thus, any linear time solution to the PPH problem likely requires a different approach. For the second problem of computing a PPH solution that minimizes the number of distinct haplotypes, we show that the problem is NP-hard using a reduction from Vertex Cover (Garey and Johnson, 1979).

Download full-text PDF

Source
http://dx.doi.org/10.1089/cmb.2004.11.858DOI Listing

Publication Analysis

Top Keywords

perfect phylogeny
8
bafna 2003
8
2003 eskin
8
eskin 2003
8
pph problem
8
number distinct
8
distinct haplotypes
8
problem computing
8
matrix multiplication
8
problem
7

Similar Publications

Article Synopsis
  • Convergent evolution in similar environments supports the idea of natural selection but often shows imperfect similarities due to differing environmental factors and genetic variations.
  • The study of 212 species of stick and leaf insects revealed that lineages independently adapted to similar habitats, demonstrating parallel morphological changes, though with variations in magnitude and direction.
  • Findings indicate that environmental similarity drives closer morphological traits among lineages, while genetic divergence over time also plays a significant, predictable role in the extent of convergence.
View Article and Find Full Text PDF

In Bayesian phylogenetic and phylodynamic studies it is common to summarise the posterior distribution of trees with a time-calibrated consensus phylogeny. While the maximum clade credibility (MCC) tree is often used for this purpose, we here show that a novel consensus tree method - the highest independent posterior subtree reconstruction, or HIPSTR - contains consistently higher supported clades over MCC. We also provide faster computational routines for estimating both consensus trees in an updated version of TreeAnnotator X, an open-source software program that summarizes the information from a sample of trees and returns many helpful statistics such as individual clade credibilities contained in the consensus tree.

View Article and Find Full Text PDF

Biological Characteristics of a Novel Bibenzyl Synthase () Gene from Catalyzing Dihydroresveratrol Synthesis.

Molecules

November 2024

Key Laboratory for Biodiversity Conservation in Karst Mountain Area of Southwestern China, National Foresty and Grassland Administration, Guiyang 550005, China.

Bibenzyl compounds are one of the most important bioactive components of natural medicine. However, as a traditional herbal medicine is rich in bibenzyl compounds and performs functions such as acting as an antioxidant, inhibiting cancer cell growth, and assisting in neuro-protection. The biosynthesis of bibenzyl products is regulated by bibenzyl synthase (BBS).

View Article and Find Full Text PDF

The Bayesian Phylogenetic Bootstrap and its Application to Short Trees and Branches.

Mol Biol Evol

November 2024

Institut de Systématique Evolution, Biodiversité (ISYEB UMR7205-CNRS, Muséum National d'Histoire Naturelle, SU, EPHE, UA), Paris 75005, France.

Felsenstein's bootstrap is the most commonly used method to measure branch support in phylogenetics. Current sequencing technologies can result in massive sampling of taxa (e.g.

View Article and Find Full Text PDF

Background: Astragalus membranaceus (Fisch.) Bunge is one of the most well-known tonic herbs in traditional Chinese medicine, renowned for its remarkable medicinal value in various clinical contexts. The corresponding chloroplast (cp) and nuclear genomes have since been accordingly sequenced, providing valuable information for breeding and phylogeny studies.

View Article and Find Full Text PDF

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!