Publications by authors named "Fabio Pardi"

EPIK: precise and scalable evolutionary placement with informative k-mers.

Nikolai Romashchenko Benjamin Linard Fabio Pardi Eric Rivals

Bioinformatics

December 2023

Motivation: Phylogenetic placement enables phylogenetic analysis of massive collections of newly sequenced DNA, when de novo tree inference is too unreliable or inefficient. Assuming that a high-quality reference tree is available, the idea is to seek the correct placement of the new sequences in that tree. Recently, alignment-free approaches to phylogenetic placement have emerged, both to circumvent the need to align the new sequences and to avoid the calculations that typically follow the alignment step.

Computing Phylo-k-Mers.

Nikolai Romashchenko Benjamin Linard Eric Rivals Fabio Pardi

IEEE/ACM Trans Comput Biol Bioinform

October 2023

Finding the correct position of new sequences within an established phylogenetic tree is an increasingly relevant problem in evolutionary bioinformatics and metagenomics. Recently, alignment-free approaches for this task have been proposed. One such approach is based on the concept of phylogenetically informative k-mers, or phylo-k-mers for short.

On the inference of complex phylogenetic networks by Markov Chain Monte-Carlo.

Charles-Elie Rabier Vincent Berry Marnus Stoltz João D Santos Wensheng Wang Fabio Pardi

PLoS Comput Biol

September 2021

For various species, high quality sequences and complete genomes are nowadays available for many individuals. This makes data analysis challenging, as methods need not only to be accurate, but also time efficient given the tremendous amount of data to process. In this article, we introduce an efficient method to infer the evolutionary history of individuals under the multispecies coalescent model in networks (MSNC).

A phylogenetic approach for weighting genetic sequences.

Nicola De Maio Alexander V Alekseyenko William J Coleman-Smith Fabio Pardi Marc A Suchard

BMC Bioinformatics

May 2021

Background: Many important applications in bioinformatics, including sequence alignment and protein family profiling, employ sequence weighting schemes to mitigate the effects of non-independence of homologous sequences and under- or over-representation of certain taxa in a dataset. These schemes aim to assign high weights to sequences that are 'novel' compared to the others in the same dataset, and low weights to sequences that are over-represented.

Results: We formalise this principle by rigorously defining the evolutionary 'novelty' of a sequence within an alignment.

Computing the probability of gene trees concordant with the species tree in the multispecies coalescent.

Jakub Truszkowski Celine Scornavacca Fabio Pardi

Theor Popul Biol

February 2021

The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between a gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several aspects of the species phylogeny, such as its topology and ancestral population sizes. A fundamental open problem in this context is how to efficiently compute the probability of a gene tree topology, given the species phylogeny.

Rapid screening and detection of inter-type viral recombinants using phylo-k-mers.

Guillaume E Scholz Benjamin Linard Nikolai Romashchenko Eric Rivals Fabio Pardi

Bioinformatics

April 2021

Motivation: Novel recombinant viruses may have important medical and evolutionary significance, as they sometimes display new traits not present in the parental strains. This is particularly concerning when the new viruses combine fragments coming from phylogenetically distinct viral types. Here, we consider the task of screening large collections of sequences for such novel recombinants.

PEWO: a collection of workflows to benchmark phylogenetic placement.

Benjamin Linard Nikolai Romashchenko Fabio Pardi Eric Rivals

Bioinformatics

January 2021

Motivation: Phylogenetic placement (PP) is a process of taxonomic identification for which several tools are now available. However, it remains difficult to assess which tool is more adapted to particular genomic data or a particular reference taxonomy. We developed Placement Evaluation WOrkflows (PEWO), the first benchmarking tool dedicated to PP assessment.

Rapid alignment-free phylogenetic identification of metagenomic sequences.

Benjamin Linard Krister Swenson Fabio Pardi

Bioinformatics

September 2019

Motivation: Taxonomic classification is at the core of environmental DNA analysis. When a phylogenetic tree can be built as a prior hypothesis to such classification, phylogenetic placement (PP) provides the most informative type of classification because each query sequence is assigned to its putative origin in the tree. This is useful whenever precision is sought [...]

Finding a most parsimonious or likely tree in a network with respect to an alignment.

Steven Kelk Fabio Pardi Celine Scornavacca Leo van Iersel

J Math Biol

January 2019

Phylogenetic networks are often constructed by merging multiple conflicting phylogenetic signals into a directed acyclic graph. It is interesting to explore whether a network constructed in this way induces biologically relevant phylogenetic signals that were not present in the input. Here we show that, given a multiple alignment A for a set of taxa X and a rooted phylogenetic network N whose leaves are labelled by X, it is NP-hard to locate a most parsimonious phylogenetic tree displayed by N (with respect to A), even when the level of N (the maximum number of reticulation nodes within a biconnected component) is 1 and A contains only 2 distinct states.

Rearrangement moves on rooted phylogenetic networks.

Philippe Gambette Leo van Iersel Mark Jones Manuel Lafond Fabio Pardi

PLoS Comput Biol

August 2017

Phylogenetic tree reconstruction is usually done by local search heuristics that explore the space of the possible tree topologies via simple rearrangements of their structure. Tree rearrangement heuristics have been used in combination with practically all optimization criteria in use, from maximum likelihood and parsimony to distance-based principles, and in a Bayesian context. Their basic components are rearrangement moves that specify all possible ways of generating alternative phylogenies from a given one, and whose fundamental property is to be able to transform, by repeated application, any phylogeny into any other phylogeny.

Do Branch Lengths Help to Locate a Tree in a Phylogenetic Network?

Philippe Gambette Leo van Iersel Steven Kelk Fabio Pardi Celine Scornavacca

Bull Math Biol

September 2016

Phylogenetic networks are increasingly used in evolutionary biology to represent the history of species that have undergone reticulate events such as horizontal gene transfer, hybrid speciation and recombination. One of the most fundamental questions that arise in this context is whether the evolution of a gene with one copy in all species can be explained by a given network. In mathematical terms, this is often translated in the following way: is a given phylogenetic tree contained in a given phylogenetic network? Recently this tree containment problem has been widely investigated from a computational perspective, but most studies have only focused on the topology of the phylogenies, ignoring a piece of information that, in the case of phylogenetic trees, is routinely inferred by evolutionary analyses: branch lengths.

Fast and accurate branch lengths estimation for phylogenomic trees.

Manuel Binet Olivier Gascuel Celine Scornavacca Emmanuel J P Douzery Fabio Pardi

BMC Bioinformatics

January 2016

Background: Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet, part of the current methodology to reconstruct a phylogeny from genomic information - namely supertree methods - focuses on the topology or structure of the phylogenetic tree, rather than the evolutionary divergences associated to it. Moreover, accurate methods to estimate branch lengths - typically based on probabilistic analysis of a concatenated alignment - are limited by large demands in memory and computing time, and may become impractical when the data sets are too large.

Reconstructible phylogenetic networks: do not distinguish the indistinguishable.

Fabio Pardi Celine Scornavacca

PLoS Comput Biol

April 2015

Phylogenetic networks represent the evolution of organisms that have undergone reticulate events, such as recombination, hybrid speciation or lateral gene transfer. An important way to interpret a phylogenetic network is in terms of the trees it displays, which represent all the possible histories of the characters carried by the organisms in the network. Interestingly, however, different networks may display exactly the same set of trees, an observation that poses a problem for network reconstruction: from the perspective of many inference methods such networks are "indistinguishable".

Combinatorics of distance-based tree inference.

Fabio Pardi Olivier Gascuel

Proc Natl Acad Sci U S A

October 2012

Several popular methods for phylogenetic inference (or hierarchical clustering) are based on a matrix of pairwise distances between taxa (or any kind of objects): The objective is to construct a tree with branch lengths so that the distances between the leaves in that tree are as close as possible to the input distances. If we hold the structure (topology) of the tree fixed, in some relevant cases [...]

Robustness of phylogenetic inference based on minimum evolution.

Fabio Pardi Sylvain Guillemot Olivier Gascuel

Bull Math Biol

October 2010

Minimum evolution is the guiding principle of an important class of distance-based phylogeny reconstruction methods, including neighbor-joining (NJ), which is the most cited tree inference algorithm to date. The minimum evolution principle involves searching for the tree with minimum length, where the length is estimated using various least-squares criteria. Since evolutionary distances cannot be known precisely but only estimated, it is important to investigate the robustness of phylogenetic reconstruction to imprecise estimates for these distances.

Approximate maximum parsimony and ancestral maximum likelihood.

Noga Alon Benny Chor Fabio Pardi Anat Rapoport

IEEE/ACM Trans Comput Biol Bioinform

May 2010

We explore the maximum parsimony (MP) and ancestral maximum likelihood (AML) criteria in phylogenetic tree reconstruction. Both problems are NP-hard, so we seek approximate solutions. We formulate the two problems as Steiner tree problems under appropriate distances.

Budgeted phylogenetic diversity on circular split systems.

Bui Quang Minh Fabio Pardi Steffen Klaere Arndt von Haeseler

IEEE/ACM Trans Comput Biol Bioinform

August 2009

In the last 15 years, Phylogenetic Diversity (PD) has gained interest in the community of conservation biologists as a surrogate measure for assessing biodiversity. We have recently proposed two approaches to select taxa for maximizing PD, namely PD with budget constraints and PD on split systems. In this paper, we will unify these two strategies and present a dynamic programming algorithm to solve the unified framework of selecting taxa with maximal PD under budget constraints on circular split systems.

Distribution of phylogenetic diversity under random extinction.

Beáta Faller Fabio Pardi Mike Steel

J Theor Biol

March 2008

Phylogenetic diversity is a measure for describing how much of an evolutionary tree is spanned by a subset of species. If one applies this to the unknown subset of current species that will still be present at some future time, then this 'future phylogenetic diversity' provides a measure of the impact of various extinction scenarios in biodiversity conservation. In this paper, we study the distribution of future phylogenetic diversity under a simple model of extinction (a generalized 'field of bullets' model).
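
As an illustration of the definition above, the following is a minimal sketch of how phylogenetic diversity could be computed for a subset of species. It assumes a rooted tree and the convention that the spanned subtree includes the path to the root; the function name, the `parent`/`length` dictionaries, and the toy tree are hypothetical and are not taken from the paper.

```python
def phylogenetic_diversity(parent, length, species):
    """Total branch length spanned by `species` in a rooted tree.

    Assumed (hypothetical) representation:
      parent[v] -> parent of node v (the root has no entry)
      length[v] -> length of the branch joining v to parent[v]
    """
    covered = set()  # nodes whose incoming branch is spanned by the subset
    for node in species:
        # Walk towards the root, stopping once the path is already counted.
        while node in parent and node not in covered:
            covered.add(node)
            node = parent[node]
    return sum(length[node] for node in covered)


# Toy tree (illustrative values only): root -> x -> {a, b}, root -> c
parent = {"x": "root", "a": "x", "b": "x", "c": "root"}
length = {"x": 1.0, "a": 2.0, "b": 3.0, "c": 4.0}

print(phylogenetic_diversity(parent, length, {"a", "b"}))  # branches a, b, x
print(phylogenetic_diversity(parent, length, {"a", "c"}))  # branches a, x, c
```

Under this convention, shared ancestral branches are counted once, so adding a close relative of an already selected species contributes only its private branch length.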

Determination and validation of principal gene products.

Michael L Tress Jan-Jaap Wesselink Adam Frankish Gonzalo López Nick Goldman Fabio Pardi

Bioinformatics

January 2008

Motivation: Alternative splicing has the potential to generate a wide range of protein isoforms. For many computational applications and for experimental research, it is important to be able to concentrate on the isoform that retains the core biological function. For many genes this is far from clear.

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Nature

June 2007

Article Synopsis

- The study reports on experiments analyzing a targeted 1% of the human genome during the ENCODE Project's pilot phase, providing crucial insights into human genome function.
- Findings reveal that the human genome is largely transcribed, with evidence showing that most genomic bases contribute to various types of transcripts, including those that do not code for proteins.
- Enhanced understanding of transcription regulation, chromatin structure, and evolutionary insights from comparisons between species help define the functional landscape of the human genome, guiding future research in genome characterization.

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome.

Elliott H Margulies Gregory M Cooper George Asimenos Daryl J Thomas Colin N Dewey Fabio Pardi

Genome Res

June 2007

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy).

Resource-aware taxon selection for maximizing phylogenetic diversity.

Fabio Pardi Nick Goldman

Syst Biol

June 2007

Phylogenetic diversity (PD) is a useful metric for selecting taxa in a range of biological applications, for example, bioconservation and genomics, where the selection is usually constrained by the limited availability of resources. We formalize taxon selection as a conceptually simple optimization problem, aiming to maximize PD subject to resource constraints. This allows us to take into account the different amounts of resources required by the different taxa.

Species choice for comparative genomics: being greedy works.

Fabio Pardi Nick Goldman

PLoS Genet

December 2005

Several projects investigating genetic function and evolution through sequencing and comparison of multiple genomes are now underway. These projects consume many resources, and appropriate planning should be devoted to choosing which species to sequence, potentially involving cooperation among different sequencing centres. A widely discussed criterion for species choice is the maximisation of evolutionary divergence.

GSMA: software implementation of the genome search meta-analysis method.

Fabio Pardi Douglas F Levinson Cathryn M Lewis

Bioinformatics

December 2005

Meta-analysis can be used to pool results of genome-wide linkage scans. This is of great value in complex diseases, where replication of linked regions occurs infrequently. The genome search meta-analysis (GSMA) method is widely used for this analysis, and a computer program is now available to implement the GSMA.

Meta-analysis of genome scans of age-related macular degeneration.

Sheila A Fisher Goncalo R Abecasis Beverly M Yashar Sepideh Zareparsi Anand Swaroop Fabio Pardi

Hum Mol Genet

August 2005

A genetic contribution to the development of age-related macular degeneration (AMD) is well established. Several genome-wide linkage studies have identified a number of putative susceptibility loci for AMD but only a few of these regions have been replicated in independent studies. Here, we perform a meta-analysis of six AMD genome screens using the genome-scan meta-analysis method, which allows linkage results from several studies to be combined, providing greater power to identify regions that show only weak evidence for linkage in individual studies.
