Bayes Factors Unmask Highly Variable Information Content, Bias, and Extreme Influence in Phylogenomic Analyses.

Syst Biol

Department of Biology, University of Hawai'i at Manoa, 2538 McCarthy Mall, Edmondson Hall Rm 216, Honolulu, HI 96822, USA.

Published: July 2017

As the application of genomic data in phylogenetics has become routine, a number of cases have arisen where alternative data sets strongly support conflicting conclusions. This sensitivity to analytical decisions has prevented firm resolution of some of the most recalcitrant nodes in the tree of life. To better understand the causes and nature of this sensitivity, we analyzed several phylogenomic data sets using an alternative measure of topological support (the Bayes factor) that both demonstrates and averts several limitations of more frequently employed support measures (such as Markov chain Monte Carlo estimates of posterior probabilities). Bayes factors reveal important, previously hidden, differences across six "phylogenomic" data sets collected to resolve the phylogenetic placement of turtles within Amniota. These data sets vary substantially in their support for well-established amniote relationships, particularly in the proportion of genes that contain extreme amounts of information as well as the proportion that strongly reject these uncontroversial relationships. All six data sets contain little information to resolve the phylogenetic placement of turtles relative to other amniotes. Bayes factors also reveal that a very small number of extremely influential genes (less than 1% of genes in a data set) can fundamentally change significant phylogenetic conclusions. In one example, these genes are shown to contain previously unrecognized paralogs. This study demonstrates both that the resolution of difficult phylogenomic problems remains sensitive to seemingly minor analysis details and that Bayes factors are a valuable tool for identifying and solving these challenges.
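The gene-by-gene comparison described in the abstract can be illustrated with a minimal sketch. It assumes that, for each gene, the marginal log-likelihood has already been estimated under two competing topological hypotheses (for example, a constrained and an unconstrained analysis); the abstract does not state how these were obtained. The statistic is the standard 2 ln(BF) = 2 (ln ML_1 - ln ML_2). The gene names, numeric values, function names, and the cutoff of 10 (a conventional "very strong" threshold, not one taken from this paper) are all hypothetical.

```python
from typing import Dict, List, Tuple


def two_ln_bf(ln_ml_h1: float, ln_ml_h2: float) -> float:
    """Return 2*ln(Bayes factor) of hypothesis 1 over hypothesis 2,
    given the marginal log-likelihoods of the data under each."""
    return 2.0 * (ln_ml_h1 - ln_ml_h2)


def classify(value: float, threshold: float = 10.0) -> str:
    """Label a 2*ln(BF) value. The default threshold is an assumed,
    conventional cutoff, not a value reported in the abstract."""
    if value > threshold:
        return "strong_support"
    if value < -threshold:
        return "strong_rejection"
    return "weak"


def summarize(
    gene_ln_ml: Dict[str, Tuple[float, float]], threshold: float = 10.0
) -> List[Tuple[str, float, str]]:
    """Compute 2*ln(BF) per gene and sort by absolute magnitude,
    so the most influential genes appear first."""
    rows = []
    for gene, (ln_ml_h1, ln_ml_h2) in gene_ln_ml.items():
        value = two_ln_bf(ln_ml_h1, ln_ml_h2)
        rows.append((gene, value, classify(value, threshold)))
    rows.sort(key=lambda row: abs(row[1]), reverse=True)
    return rows


# Hypothetical marginal log-likelihoods: (ln ML under H1, ln ML under H2).
example = {
    "gene_A": (-10234.6, -10251.1),
    "gene_B": (-8420.3, -8419.9),
    "gene_C": (-15003.2, -14940.7),
}

if __name__ == "__main__":
    for gene, value, label in summarize(example):
        print(f"{gene}\t{value:.1f}\t{label}")
```

Sorting by absolute value places genes with extreme support or extreme rejection at the top of the list, which is one simple way to surface candidates for closer inspection, such as the undetected paralogs mentioned in the abstract.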

Source
http://dx.doi.org/10.1093/sysbio/syw101
