Biology has become a highly mathematical discipline in which probabilistic models play a central role. As a result, research in the biological sciences is now dependent on computational tools capable of carrying out complex analyses. These tools must be validated before they can be used, but what is understood as validation varies widely among methodological contributions.
Time-scaled phylogenetic trees are an ultimate goal of evolutionary biology and a necessary ingredient in comparative studies. The accumulation of genomic data has resolved the tree of life to a great extent, yet timing evolutionary events remains challenging, if not impossible, without external information such as fossil ages and morphological characters. Methods for incorporating morphology in tree estimation have lagged behind their molecular counterparts, especially in the case of continuous characters.
Phylogenetic models have become increasingly complex, and phylogenetic data sets have expanded in both size and richness. However, current inference tools lack a model specification language that can concisely describe a complete phylogenetic analysis while remaining independent of implementation details. We introduce a new lightweight and concise model specification language, 'LPhy', which is designed to be both human and machine-readable.
Single-cell sequencing provides a new way to explore the evolutionary history of cells. Compared to traditional bulk sequencing, where a population of heterogeneous cells is pooled to form a single observation, single-cell sequencing isolates and amplifies genetic material from individual cells, thereby preserving the information about the origin of the sequences. However, single-cell data are more error-prone than bulk sequencing data due to the limited genomic material available per cell.
New Zealand, Australia, Iceland, and Taiwan all saw success in controlling their first waves of coronavirus disease 2019 (COVID-19). As islands, they make excellent case studies for exploring the effects of international travel and human movement on the spread of COVID-19. We employed a range of robust phylodynamic methods and genome subsampling strategies to infer the epidemiological history of severe acute respiratory syndrome coronavirus 2 in these four countries.
Evolutionary models account for either population- or species-level processes but usually not both. We introduce a new model, the FBD-MSC, which makes it possible for the first time to integrate both the genealogical and fossilization phenomena, by means of the multispecies coalescent (MSC) and the fossilized birth-death (FBD) processes. Using this model, we reconstruct the phylogeny representing all extant and many fossil Caninae, recovering both the relative and absolute time of speciation events.
The phosphoprotein gene of the paramyxoviruses encodes multiple protein products. The P, V, and W proteins are generated by transcriptional slippage. This process results in the insertion of non-templated guanosine nucleosides into the mRNA at a conserved edit site.
Real-time genomic sequencing has played a major role in tracking the global spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), contributing greatly to disease mitigation strategies. In August 2020, after having eliminated the virus, New Zealand experienced a second outbreak. During that outbreak, New Zealand used genomic sequencing in a primary role, leading to a second elimination of the virus.
Since the first wave of coronavirus disease in March 2020, citizens and permanent residents returning to New Zealand have been required to undergo managed isolation and quarantine (MIQ) for 14 days and mandatory testing for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). As of October 20, 2020, of 62,698 arrivals, testing of persons in MIQ had identified 215 cases of SARS-CoV-2 infection. Among 86 passengers on a flight from Dubai, United Arab Emirates, that arrived in New Zealand on September 29, test results were positive for 7 persons in MIQ.
New Zealand, a geographically remote Pacific island with easily sealable borders, implemented a nationwide 'lockdown' of all non-essential services to curb the spread of COVID-19. Here, we generate 649 SARS-CoV-2 genome sequences from infected patients in New Zealand with samples collected during the 'first wave', representing 56% of all confirmed cases in this time period. Despite its remoteness, the viruses imported into New Zealand represented nearly all of the genomic diversity sequenced from the global virus population.
Background: Bayesian MCMC has become a common approach for phylogenetic inference. However, the growing size of molecular sequence data sets has created a pressing need to improve the computational efficiency of Bayesian phylogenetic inference algorithms.
Results: This paper develops a new algorithm to improve the efficiency of Bayesian phylogenetic inference for models that include a per-branch rate parameter.
PLoS Comput Biol
February 2020
Transcription elongation can be modelled as a three-step process, involving polymerase translocation, NTP binding, and nucleotide incorporation into the nascent mRNA. This cycle of events can be simulated at the single-molecule level as a continuous-time Markov process using parameters derived from single-molecule experiments. Previously developed models differ in the way they are parameterised, and in their incorporation of partial equilibrium approximations.
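As a rough illustration of what simulating this cycle as a continuous-time Markov process can look like, the following is a minimal Gillespie-style sketch. It is not the model from the paper: the three-state layout, the reversibility of the first two steps, the rate names, and the rate values in `RATES` are all assumptions chosen for illustration.

```python
import random

# Hypothetical rate constants (per second); illustrative values only,
# not parameters taken from any experiment.
RATES = {
    "translocate_fwd": 100.0,   # pre-translocated -> post-translocated
    "translocate_back": 50.0,   # post-translocated -> pre-translocated
    "ntp_bind": 200.0,          # post-translocated -> NTP bound
    "ntp_release": 20.0,        # NTP bound -> post-translocated
    "catalysis": 30.0,          # NTP bound -> pre-translocated, +1 nucleotide
}

def simulate_elongation(n_nucleotides, rates, seed=None):
    """Return the simulated time taken to incorporate n_nucleotides.

    States: 0 = pre-translocated, 1 = post-translocated, 2 = NTP bound.
    """
    rng = random.Random(seed)
    state, t, added = 0, 0.0, 0
    while added < n_nucleotides:
        # Each move is (rate, next_state, nucleotides_added).
        if state == 0:
            moves = [(rates["translocate_fwd"], 1, 0)]
        elif state == 1:
            moves = [(rates["translocate_back"], 0, 0),
                     (rates["ntp_bind"], 2, 0)]
        else:
            moves = [(rates["ntp_release"], 1, 0),
                     (rates["catalysis"], 0, 1)]
        total = sum(rate for rate, _, _ in moves)
        # Waiting time to the next event is exponential with the total rate.
        t += rng.expovariate(total)
        # Pick one move with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for rate, nxt, inc in moves:
            acc += rate
            if u < acc:
                break
        state = nxt
        added += inc
    return t

if __name__ == "__main__":
    print(simulate_elongation(100, RATES, seed=42))
```

Repeating the call with different seeds gives a distribution of elongation times, which is the kind of output that could be compared against single-molecule measurements.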
The role of avian hosts in shaping the persistence, evolution, and dispersal of global low pathogenic avian influenza virus (LPAIV) H9N2 remains uncertain. Under a Bayesian Markov chain Monte Carlo framework, we used discrete trait analysis (DTA) to reconstruct host and location switches in the evolutionary history of global H9N2, given hemagglutinin gene sequences from 18 countries/regions between 1976 and 2018. We employed generalized linear models (GLMs) to inform virus migration rates by empirical predictors.
Model-based phylodynamic approaches have recently employed generalized linear models (GLMs) to uncover potential predictors of viral spread. Very recently, some of these models have allowed both the predictors and their coefficients to be time-dependent. However, these studies mainly focused on predictors that are assumed to be constant through time.
Modern phylodynamic methods interpret an inferred phylogenetic tree as a partial transmission chain providing information about the dynamic process of transmission and removal (where removal may be due to recovery, death, or behavior change). Birth-death and coalescent processes have been introduced to model the stochastic dynamics of epidemic spread under common epidemiological models such as the SIS and SIR models and are successfully used to infer phylogenetic trees together with transmission (birth) and removal (death) rates. These methods either integrate analytically over past incidence and prevalence to infer rate parameters, and thus cannot explicitly infer past incidence or prevalence, or allow such inference only in the coalescent limit of large population size.
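To make the transmission (birth) and removal (death) events concrete, here is a minimal forward simulation of a stochastic SIR epidemic. It illustrates the underlying event process only, not the inference methods described above; the function name, the frequency-dependent transmission term, and the parameter values in the usage lines are assumptions.

```python
import random

def simulate_sir(beta, gamma, s0, i0, seed=None):
    """Simulate a stochastic SIR epidemic until no infected individuals remain.

    beta  : per-capita transmission rate (assumed frequency-dependent)
    gamma : per-capita removal rate
    Returns the list of (time, event) pairs and the final (S, I, R) counts.
    """
    rng = random.Random(seed)
    s, i, r, t = s0, i0, 0, 0.0
    n = s0 + i0
    events = []
    while i > 0:
        rate_transmission = beta * s * i / n
        rate_removal = gamma * i
        total = rate_transmission + rate_removal
        # Exponential waiting time to the next event.
        t += rng.expovariate(total)
        if rng.random() * total < rate_transmission:
            s, i = s - 1, i + 1          # a "birth" in the transmission tree
            events.append((t, "transmission"))
        else:
            i, r = i - 1, r + 1          # a "death" (recovery, death, ...)
            events.append((t, "removal"))
    return events, (s, i, r)

if __name__ == "__main__":
    events, final = simulate_sir(beta=2.0, gamma=1.0, s0=99, i0=1, seed=7)
    print(len(events), final)
```

The recorded event times give the incidence and prevalence trajectory directly, which is the quantity that the analytically integrated methods do not recover.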
Elaboration of Bayesian phylogenetic inference methods has continued apace in recent years, with major new advances in nearly all aspects of the joint modelling of evolutionary data. It is increasingly appreciated that some evolutionary questions can only be adequately answered by combining evidence from multiple independent sources of data, including genome sequences, sampling dates, phenotypic data, radiocarbon dates, fossil occurrences, and biogeographic range information, among others. Including all relevant data in a single joint model is very challenging both conceptually and computationally.
Simulations are widely used to provide expectations and predictive distributions under known conditions against which to compare empirical data. Such simulations are also invaluable for testing and comparing the behaviour and power of inference methods. We describe SANTA-SIM, a software package to simulate the evolution of a population of gene sequences forwards through time.
Invertebrates are a major component of terrestrial ecosystems; however, estimating their biodiversity is challenging. We compiled an inventory of invertebrate biodiversity along an elevation gradient on the temperate forested island of Hauturu, New Zealand, by DNA barcoding of specimens obtained from leaf litter samples and pitfall traps. We compared the barcodes and biodiversity estimates from this data set with those from a parallel DNA metabarcoding analysis of soil from the same locations, and with pre-existing sequences in reference databases, before exploring the use of combined data sets as a basis for estimating total invertebrate biodiversity.
Rapidly evolving pathogens, such as viruses and bacteria, accumulate genetic change on a timescale similar to that over which their epidemiological processes occur, such that it is possible to make inferences about their infectious spread using phylogenetic time-trees. For this purpose, it is necessary to choose a phylodynamic model. However, the resulting inferences are contingent on whether the model adequately describes key features of the data.
The Bayesian Evolutionary Analysis by Sampling Trees (BEAST) software package has become a primary tool for Bayesian phylogenetic and phylodynamic inference from genetic sequence data. BEAST unifies molecular phylogenetic reconstruction with complex discrete and continuous trait evolution, divergence-time dating, and coalescent demographic models in an efficient statistical inference engine using Markov chain Monte Carlo integration. A convenient, cross-platform, graphical user interface allows the flexible construction of complex evolutionary analyses.
Bayesian inference of phylogeny using Markov chain Monte Carlo (MCMC) plays a central role in understanding evolutionary history from molecular sequence data. Visualizing and analyzing the MCMC-generated samples from the posterior distribution is a key step in any non-trivial Bayesian inference. We present the software package Tracer (version 1.
A birth-death-sampling model gives rise to phylogenetic trees with samples from the past and the present. Interpreting "birth" as branching speciation, "death" as extinction, and "sampling" as fossil preservation and recovery, this model - also referred to as the fossilized birth-death (FBD) model - gives rise to phylogenetic trees on extant and fossil samples. The model has been mathematically analyzed and successfully applied to a range of datasets on different taxonomic levels, such as penguins, plants, and insects.
Reticulate species evolution, such as hybridization or introgression, is relatively common in nature. In the presence of reticulation, species relationships can be captured by a rooted phylogenetic network, and orthologous gene evolution can be modeled as bifurcating gene trees embedded in the species network. We present a Bayesian approach to jointly infer species networks and gene trees from multilocus sequence data.
Phylogenetics and phylodynamics are central topics in modern evolutionary biology. Phylogenetic methods reconstruct the evolutionary relationships among organisms, whereas phylodynamic approaches reveal the underlying diversification processes that lead to the observed relationships. These two fields have many practical applications in disciplines as diverse as epidemiology, developmental biology, palaeontology, ecology, and linguistics.