Publications by authors named "Hector Banos"

The tree of blobs of a species network shows only the tree-like aspects of relationships of taxa on a network, omitting information on network substructures where hybridization or other types of lateral transfer of genetic information occur. By isolating such regions of a network, inference of the tree of blobs can serve as a starting point for a more detailed investigation, or indicate the limit of what may be inferrable without additional assumptions. Building on our theoretical work on the identifiability of the tree of blobs from gene quartet distributions under the Network Multispecies Coalescent model, we develop an algorithm, TINNiK, for statistically consistent tree of blobs inference.

Article Synopsis
  • The Pisgah clade of Desmognathus salamanders showcases complex diversification with reticulation between lineages, involving two phenotypes: aquatic "shovel-nosed" and semi-aquatic "black-bellied" forms.
  • Geographically close populations demonstrate genetic mixing, leading to hybrid speciation dynamics, particularly between the different lineages of black-bellied salamanders.
  • Current computational challenges limit full reconstruction of their evolutionary networks, but a proposed heuristic method offers a way to discern potential reticulations and their ecological implications.
Article Synopsis
  • Profile mixture models describe amino acid substitution in proteins by assigning sites to classes with different amino acid composition profiles, while all classes share a common exchangeability matrix.
  • The GTRpmix model improves these analyses by estimating a common exchangeability matrix for multiple profiles, leading to better fit and accuracy in phylogenetic studies compared to previously used matrices like LG.
  • Two new exchangeability matrices, ELM for eukaryotic proteins and EAL for eukaryotes and Archaea, enhance the performance of phylogenetic analyses with profile mixture models, and IQ-TREE2 now supports this advanced estimation.

When hybridization or other forms of lateral gene transfer have occurred, evolutionary relationships of species are better represented by phylogenetic networks than by trees. While inference of such networks remains challenging, several recently proposed methods are based on quartet concordance factors, that is, the probabilities that a tree relating a gene sampled from the species displays each of the possible 4-taxon relationships. Building on earlier results, we investigate what level-1 network features are identifiable from concordance factors under the network multispecies coalescent model.
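
As an illustration of the quantity described above, the following minimal sketch estimates the concordance factors for one set of four taxa as the relative frequencies of the three possible quartet topologies across sampled gene trees. The function name `empirical_quartet_cf` and the input format (one split per gene, given as a pair of taxon pairs) are assumptions made for this example and are not taken from the paper; the abstract defines concordance factors as probabilities, of which these frequencies are only sample estimates.

```python
from collections import Counter


def empirical_quartet_cf(taxa, observed_splits):
    """Estimate quartet concordance factors for four taxa.

    taxa            -- tuple of four taxon names (a, b, c, d)
    observed_splits -- one split per gene tree, e.g. (("a", "b"), ("c", "d"))
    Returns frequencies in the order ab|cd, ac|bd, ad|bc.
    """
    a, b, c, d = taxa
    # The three possible unrooted resolved topologies on four taxa.
    topologies = [
        frozenset({frozenset({a, b}), frozenset({c, d})}),
        frozenset({frozenset({a, c}), frozenset({b, d})}),
        frozenset({frozenset({a, d}), frozenset({b, c})}),
    ]
    # Order within a side, and order of the two sides, does not matter.
    counts = Counter(
        frozenset(frozenset(side) for side in split) for split in observed_splits
    )
    total = sum(counts[t] for t in topologies)
    if total == 0:
        raise ValueError("no resolved quartet topologies observed")
    return tuple(counts[t] / total for t in topologies)


# Hypothetical sample of ten gene trees restricted to taxa a, b, c, d.
sample = (
    [(("a", "b"), ("c", "d"))] * 6
    + [(("a", "c"), ("b", "d"))] * 3
    + [(("d", "a"), ("c", "b"))] * 1
)
cf = empirical_quartet_cf(("a", "b", "c", "d"), sample)
```

Unresolved or malformed quartets are simply ignored by this sketch; a real analysis would need to decide how to treat them.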

Reticulations in a phylogenetic network represent processes such as gene flow, admixture, recombination and hybrid speciation. Extending definitions from the tree setting, an anomalous network is one in which some unrooted tree topology displayed in the network appears in gene trees with a lower frequency than a tree not displayed in the network. We investigate anomalous networks under the Network Multispecies Coalescent Model with possible correlated inheritance at reticulations.

Biochemical constraints on the admissible amino acids at specific sites in proteins lead to heterogeneity of the amino acid substitution process over sites in alignments. It is well known that phylogenetic models of protein sequence evolution that do not account for site heterogeneity are prone to long-branch attraction (LBA) artifacts. Profile mixture models were developed to model heterogeneity of preferred amino acids at sites via a finite distribution of site classes each with a distinct set of equilibrium amino acid frequencies.

Inference of species networks from genomic data under the Network Multispecies Coalescent Model is currently severely limited by heavy computational demands. It also remains unclear how complicated networks can be for consistent inference to be possible. As a step toward inferring a general species network, this work considers its tree of blobs, in which non-cut edges are contracted to nodes, so only tree-like relationships between the taxa are shown.
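
The contraction described above can be sketched on an undirected graph: find the cut edges (bridges), then merge the endpoints of every non-cut edge into a single blob node. The function name `tree_of_blobs`, the edge-list input, and the example network are assumptions for illustration. This is not the inference procedure of the paper, which works from genomic data rather than a known network, and it omits any further clean-up such as suppressing degree-2 nodes.

```python
def tree_of_blobs(edges):
    """Contract every non-cut edge of an undirected graph.

    edges -- list of (u, v) pairs; parallel edges are allowed.
    Returns (blob_of, tree_edges): a map from each node to its blob
    representative, and the set of cut edges between blobs.
    """
    adj = {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))

    disc, low, bridges = {}, {}, set()
    counter = 0

    def dfs(node, parent_edge):
        # Standard low-link search; recursion is fine for small examples.
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        for nxt, idx in adj[node]:
            if idx == parent_edge:
                continue  # skip only the edge we arrived by
            if nxt in disc:
                low[node] = min(low[node], disc[nxt])
            else:
                dfs(nxt, idx)
                low[node] = min(low[node], low[nxt])
                if low[nxt] > disc[node]:
                    bridges.add(idx)  # no back edge bypasses this edge

    for node in adj:
        if node not in disc:
            dfs(node, None)

    # Union-find: merge the endpoints of every non-cut edge.
    parent = {n: n for n in adj}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for idx, (u, v) in enumerate(edges):
        if idx not in bridges:
            parent[find(u)] = find(v)

    blob_of = {n: find(n) for n in adj}
    tree_edges = {
        frozenset((blob_of[u], blob_of[v]))
        for idx, (u, v) in enumerate(edges)
        if idx in bridges
    }
    return blob_of, tree_edges


# Hypothetical network: a 4-cycle u1-u2-u3-u4 with pendant taxa a, b, c
# and a small tree-like part (v with leaves d and e) attached at u4.
example_edges = [
    ("a", "u1"), ("b", "u2"), ("c", "u3"),
    ("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u4", "u1"),
    ("u4", "v"), ("v", "d"), ("v", "e"),
]
blob_of, tree_edges = tree_of_blobs(example_edges)
```

In this example the four cycle nodes collapse into one blob, leaving a tree with seven nodes and six edges.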

As genomic-scale datasets motivate research on species tree inference, simulators of the multispecies coalescent (MSC) process have become essential for the testing and evaluation of new inference methods. However, the simulators themselves must be tested to ensure that they give valid samples. This work develops methods for checking whether a collection of gene trees is in accord with the MSC model on a given species tree.
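
One simple check of this kind, given here only as a sketch and not necessarily one of the methods developed in the paper, compares observed quartet topology counts with the standard MSC expectation for a 4-taxon species tree. With internal branch length t in coalescent units, the topology matching the species tree has probability 1 - (2/3)e^{-t} and each of the other two has probability (1/3)e^{-t}. The function names and the example counts are hypothetical.

```python
import math


def msc_quartet_probs(t):
    """Expected unrooted quartet topology probabilities under the MSC.

    t -- internal branch length of the 4-taxon species tree, in
         coalescent units. The first entry is the topology matching
         the species tree; the other two are the discordant ones.
    """
    discordant = math.exp(-t) / 3.0
    return (1.0 - 2.0 * discordant, discordant, discordant)


def pearson_gof(counts, t):
    """Pearson chi-square test of observed counts against the MSC.

    counts -- (concordant, discordant_1, discordant_2) gene tree counts.
    Returns (statistic, p_value). With three categories and t treated
    as known there are 2 degrees of freedom, for which the chi-square
    survival function is exactly exp(-x / 2).
    """
    n = sum(counts)
    expected = [n * p for p in msc_quartet_probs(t)]
    stat = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    return stat, math.exp(-stat / 2.0)


# Hypothetical simulator output for one quartet with t = 1.0.
stat, p_value = pearson_gof((760, 118, 122), 1.0)
```

A small p-value would suggest the sampled gene trees do not match the model on that species tree; a full test of a simulator would need to combine evidence across many quartets and account for multiple comparisons.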

Inference of network-like evolutionary relationships between species from genomic data must address the interwoven signals from both gene flow and incomplete lineage sorting. The heavy computational demands of standard approaches to this problem severely limit the size of datasets that may be analyzed, in both the number of species and the number of genetic loci. Here we provide a theoretical pointer to more efficient methods, by showing that logDet distances computed from genomic-scale sequences retain sufficient information to recover network relationships in the level-1 ultrametric case.
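
For readers unfamiliar with logDet distances, the sketch below computes one common normalisation for a pair of aligned DNA sequences: d = -(1/4) [ ln|det F| - (1/2) ln(g_a g_b) ], where F is the 4x4 matrix of relative site-pattern frequencies and g_a, g_b are the products of its row sums and column sums. Whether this matches the exact normalisation used in the paper is an assumption, and the function names are hypothetical.

```python
import math

BASES = "ACGT"


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def logdet_distance(seq_a, seq_b):
    """LogDet distance between two aligned DNA sequences.

    Sites where either sequence has a character outside ACGT are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0.0] * 4 for _ in range(4)]
    n = 0
    for x, y in zip(seq_a, seq_b):
        if x in index and y in index:
            counts[index[x]][index[y]] += 1.0
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    freq = [[c / n for c in row] for row in counts]
    d = det(freq)
    if d == 0.0:
        raise ValueError("logDet distance undefined: singular frequency matrix")
    row_sums = [sum(row) for row in freq]
    col_sums = [sum(freq[i][j] for i in range(4)) for j in range(4)]
    log_g = sum(math.log(s) for s in row_sums + col_sums)
    return -(math.log(abs(d)) - 0.5 * log_g) / 4.0


# Hypothetical toy alignment: 20 sites, one substitution.
seq_1 = "ACGT" * 5
seq_2 = "CCGT" + "ACGT" * 4
d_same = logdet_distance(seq_1, seq_1)
d_diff = logdet_distance(seq_1, seq_2)
```

The distance is undefined when the frequency matrix is singular, which happens easily on short alignments; this fits the abstract's focus on genomic-scale sequences.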

Summary: MSCquartets is an R package for species tree hypothesis testing, inference of species trees, and inference of species networks under the Multispecies Coalescent model of incomplete lineage sorting and its network analog. Input for these analyses is a collection of metric or topological locus trees, which are then summarized by the quartets displayed on them. Results of hypothesis tests at user-supplied levels are displayed in a simplex plot by color-coded points.

Species networks generalize the notion of species trees to allow for hybridization or other lateral gene transfer. Under the network multispecies coalescent model, individual gene trees arising from a network can have any topology, but arise with frequencies dependent on the network structure and numerical parameters. We propose a new algorithm for statistical inference of a level-1 species network under this model, from data consisting of gene tree topologies, and provide the theoretical justification for it.

Mixtures of group-based Markov models of evolution correspond to joins of toric varieties. In this paper, we establish a large number of cases for which these phylogenetic join varieties realize their expected dimension, meaning that they are nondefective. Nondefectiveness is not only interesting from a geometric point-of-view, but has been used to establish combinatorial identifiability for several classes of phylogenetic mixture models.

We show that many topological features of level-1 species networks are identifiable from the distribution of the gene tree quartets under the network multispecies coalescent model. In particular, every cycle of size at least 4 and every hybrid node in a cycle of size at least 5 are identifiable. This is a step toward justifying the inference of such networks, which was recently implemented by Solís-Lemus and Ané.
