Publications by Alan Filipski

Phylogenetic placement of metagenomic reads using the minimum evolution principle.

Alan Filipski, Koichiro Tamura, Paul Billing-Ross, Oscar Murillo, Sudhir Kumar

BMC Genomics

January 2016

Background: A central problem of computational metagenomics is determining the correct placement into an existing phylogenetic tree of individual reads (nucleotide sequences of varying lengths, ranging from hundreds to thousands of bases) obtained using next-generation sequencing of DNA samples from a mixture of known and unknown species. Correct placement allows us to easily identify or classify the sequences in the sample as to taxonomic position or function.

Results: Here we propose a novel method (PhyClass), based on the Minimum Evolution (ME) phylogenetic inference criterion, for determining the appropriate phylogenetic position of each read.

A Protocol for Diagnosing the Effect of Calibration Priors on Posterior Time Estimates: A Case Study for the Cambrian Explosion of Animal Phyla.

Fabia U Battistuzzi, Paul Billing-Ross, Oscar Murillo, Alan Filipski, Sudhir Kumar

Mol Biol Evol

July 2015

We present a procedure to test the effect of calibration priors on estimated times, which applies a recently developed calibration-free method (RelTime) that produces relative divergence times for all nodes in the tree. We illustrate this protocol by applying it to a timetree of metazoan diversification (Erwin DH, Laflamme M, Tweedt SM, Sperling EA, Pisani D, Peterson KJ. 2011.

Prospects for building large timetrees using molecular data with incomplete gene coverage among species.

Alan Filipski, Oscar Murillo, Anna Freydenzon, Koichiro Tamura, Sudhir Kumar

Mol Biol Evol

September 2014

Scientists are assembling sequence data sets from increasing numbers of species and genes to build comprehensive timetrees. However, data are often unavailable for some species and gene combinations, and the proportion of missing data is often large for data sets containing many genes and species. Surprisingly, there has not been a systematic analysis of the effect of the degree of sparseness of the species-gene matrix on the accuracy of divergence time estimates.

MEGA6: Molecular Evolutionary Genetics Analysis version 6.0.

Koichiro Tamura, Glen Stecher, Daniel Peterson, Alan Filipski, Sudhir Kumar

Mol Biol Evol

December 2013

We announce the release of an advanced version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which currently contains facilities for building sequence alignments, inferring phylogenetic histories, and conducting molecular evolutionary analysis. In version 6.0, MEGA now enables the inference of timetrees, as it implements the RelTime method for estimating divergence times for all branching points in a phylogeny.

Estimating divergence times in large molecular phylogenies.

Koichiro Tamura, Fabia Ursula Battistuzzi, Paul Billing-Ross, Oscar Murillo, Alan Filipski

Proc Natl Acad Sci U S A

November 2012

Molecular dating of species divergences has become an important means to add a temporal dimension to the Tree of Life. Increasingly larger datasets encompassing greater taxonomic diversity are becoming available to generate molecular timetrees by using sophisticated methods that model rate variation among lineages. However, the practical application of these methods is challenging because of the exorbitant calculation times required by current methods for contemporary data sizes, the difficulty in correctly modeling the rate heterogeneity in highly diverse taxonomic groups, and the lack of reliable clock calibrations and their uncertainty distributions for most groups of species.

Statistics and truth in phylogenomics.

Sudhir Kumar, Alan J Filipski, Fabia U Battistuzzi, Sergei L Kosakovsky Pond, Koichiro Tamura

Mol Biol Evol

February 2012

Phylogenomics refers to the inference of historical relationships among species using genome-scale sequence data and to the use of phylogenetic analysis to infer protein function in multigene families. With rapidly decreasing sequencing costs, phylogenomics is becoming synonymous with evolutionary analysis of genome-scale and taxonomically densely sampled data sets. In phylogenetic inference applications, this translates into very large data sets that yield evolutionary and functional inferences with extremely small variances and high statistical confidence (P value).

Phylomedicine: an evolutionary telescope to explore and diagnose the universe of disease mutations.

Sudhir Kumar, Joel T Dudley, Alan Filipski, Li Liu

Trends Genet

September 2011

Modern technologies have made the sequencing of personal genomes routine. They have revealed thousands of nonsynonymous (amino acid altering) single nucleotide variants (nSNVs) of protein-coding DNA per genome. What do these variants foretell about an individual's predisposition to diseases? The experimental technologies required to carry out such evaluations at a genomic scale are not yet available.

Performance of relaxed-clock methods in estimating evolutionary divergence times and their credibility intervals.

Fabia U Battistuzzi, Alan Filipski, S Blair Hedges, Sudhir Kumar

Mol Biol Evol

June 2010

The rapid expansion of sequence data and the development of statistical approaches that embrace varying evolutionary rates among lineages have encouraged many more investigators to use DNA and protein data to time species divergences. Here, we report results from a systematic evaluation, by means of computer simulation, of the performance of two frequently used relaxed-clock methods for estimating these times and their credibility intervals (CrIs). These relaxed-clock methods allow rates to vary in a phylogeny randomly over lineages (e.

Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations.

Sudhir Kumar, Michael P Suleski, Glenn J Markov, Simon Lawrence, Antonio Marco, Alan J Filipski

Genome Res

September 2009

As the cost of DNA sequencing drops, we are moving beyond one genome per species to one genome per individual to improve prevention, diagnosis, and treatment of disease by using personal genotypes. Computational methods are frequently applied to predict impairment of gene function by nonsynonymous mutations in individual genomes and single nucleotide polymorphisms (nSNPs) in populations. These computational tools are, however, known to fail 15%-40% of the time.

Evolution of genes and genomes on the Drosophila phylogeny.

Nature

November 2007

Comparative analysis of multiple genomes in a phylogenetic framework dramatically improves the precision and sensitivity of evolutionary inference, producing more robust results than single-genome analyses can provide. The genomes of 12 Drosophila species, ten of which are presented here for the first time (sechellia, simulans, yakuba, erecta, ananassae, persimilis, willistoni, mojavensis, virilis and grimshawi), illustrate how rates and patterns of sequence divergence across taxa can illuminate evolutionary processes on a genomic scale. These genome sequences augment the formidable genetic tools that have made Drosophila melanogaster a pre-eminent model for animal genetics, and will further catalyse fundamental research on mechanisms of development, cell biology, genetics, disease, neurobiology, behaviour, physiology and evolution.

Multiple sequence alignment: in pursuit of homologous DNA positions.

Sudhir Kumar, Alan Filipski

Genome Res

February 2007

DNA sequence alignment is a prerequisite to virtually all comparative genomic analyses, including the identification of conserved sequence motifs, estimation of evolutionary divergence between sequences, and inference of historical relationships among genes and species. While it is mere common sense that inaccuracies in multiple sequence alignments can have detrimental effects on downstream analyses, it is important to know the extent to which the inferences drawn from these alignments are robust to errors and biases inherent in all sequence alignments. A survey of investigations into strengths and weaknesses of sequence alignments reveals, as expected, that alignment quality is generally poor for two distantly related sequences and can often be improved by adding additional sequences as stepping stones between distantly related species.

Placing confidence limits on the molecular age of the human-chimpanzee divergence.

Sudhir Kumar, Alan Filipski, Vinod Swarna, Alan Walker, S Blair Hedges

Proc Natl Acad Sci U S A

December 2005

Molecular clocks have been used to date the divergence of humans and chimpanzees for nearly four decades. Nonetheless, this date and its confidence interval remain to be firmly established. In an effort to generate a genomic view of the human-chimpanzee divergence, we have analyzed 167 nuclear protein-coding genes and built a reliable confidence interval around the calculated time by applying a multifactor bootstrap-resampling approach.

Measuring conservation of contiguous sets of autosomal markers on bovine and porcine genomes in relation to the map of the human genome.

Zhihua Jiang, Jenna S Melville, Honghe Cao, Sudhir Kumar, Alan Filipski

Genome

August 2002

Based on published information, we have identified 991 genes and gene-family clusters for cattle and 764 for pigs that have orthologues in the human genome. The relative linear locations of these genes on human sequence maps were used as "rulers" to annotate bovine and porcine genomes based on a CSAM (contiguous sets of autosomal markers) approach. A CSAM is an uninterrupted set of markers in one genome (primary genome; the human genome in this study) that is syntenic in the other genome (secondary genome; the bovine and porcine genomes in this study).
