Background: The vast majority of phylogenetic trees are inferred from molecular sequence data (nucleotides or amino acids) using time-reversible evolutionary models which assume that, for any pair of nucleotide or amino acid characters, the relative rate of X to Y substitution is the same as the relative rate of Y to X substitution. However, this reversibility assumption is unlikely to accurately reflect the actual underlying biochemical and/or evolutionary processes that lead to the fixation of substitutions. Here, we use empirical viral genome sequence data to reveal that evolutionary non-reversibility is pervasive among most groups of viruses. Specifically, we consider two non-reversible nucleotide substitution models: (1) a 6-rate non-reversible model (NREV6) in which Watson-Crick complementary substitutions occur at identical relative rates and which might therefore be most applicable to analyzing the evolution of genomes where both complementary strands are subject to the same mutational processes (such as might be expected for double-stranded (ds) RNA or dsDNA genomes); and (2) a 12-rate non-reversible model (NREV12) in which all relative substitution types are free to occur at different rates and which might therefore be applicable to analyzing the evolution of genomes where the complementary genome strands are subject to different mutational processes (such as might be expected for viruses with single-stranded (ss) RNA or ssDNA genomes).
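
To make the parameter tying in these models concrete, the sketch below groups the 12 possible nucleotide substitutions into the classes that share one parameter under each model. It is an illustration only, not the authors' implementation: the function names `rate_class` and `rate_classes` are hypothetical, and for GTR the shared quantity is assumed to be the symmetric exchangeability (the actual X to Y rate under GTR also depends on the frequency of the target base).

```python
from itertools import permutations

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rate_class(model, src, dst):
    """Return a label for the parameter that governs the substitution src -> dst.

    Hypothetical helper for illustration; model is "GTR", "NREV6" or "NREV12".
    """
    if model == "NREV12":
        # Every ordered substitution has its own rate.
        return (src, dst)
    if model == "NREV6":
        # A substitution shares its rate with its Watson-Crick complement,
        # e.g. A->C with T->G. Direction still matters: A->C != C->A.
        complementary = (COMPLEMENT[src], COMPLEMENT[dst])
        return min((src, dst), complementary)
    if model == "GTR":
        # Assumption: the label is the symmetric exchangeability, so the
        # forward and reverse substitutions share one parameter.
        return tuple(sorted((src, dst)))
    raise ValueError("unknown model: " + model)


def rate_classes(model):
    """Group all 12 ordered substitutions by the parameter they share."""
    classes = {}
    for src, dst in permutations(BASES, 2):
        classes.setdefault(rate_class(model, src, dst), []).append(src + ">" + dst)
    return classes


if __name__ == "__main__":
    for model in ("GTR", "NREV6", "NREV12"):
        groups = rate_classes(model)
        print(model, len(groups), sorted(groups.values()))
```

Under this grouping, GTR and NREV6 both have six classes but tie different substitutions together: GTR pairs each substitution with its reverse, whereas NREV6 pairs each substitution with its strand complement.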

Results: Using likelihood ratio and Akaike Information Criterion-based model tests, we show that, surprisingly, NREV12 provided a significantly better fit to 21/31 dsRNA and 20/30 dsDNA datasets than did the general time-reversible (GTR) and NREV6 models, with NREV6 providing a better fit than NREV12 and GTR in only 5/30 dsDNA and 2/31 dsRNA datasets. As expected, NREV12 provided a significantly better fit to 24/33 ssDNA and 40/47 ssRNA datasets. Next, we used simulations to show that increasing degrees of strand-specific substitution bias decrease the accuracy of phylogenetic inference irrespective of whether GTR or NREV12 is used to describe mutational processes. However, in cases where strand-specific substitution biases are extreme (such as in SARS-CoV-2 and Torque teno sus virus datasets), NREV12 tends to yield more accurate phylogenetic trees than those obtained using GTR.
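
The two kinds of model test mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. The log-likelihoods and parameter counts in the usage example are made-up placeholders rather than values from the study, and the function names are hypothetical. A likelihood ratio test applies when one model is a special case of the other; the statistic is then usually compared against a chi-square distribution whose degrees of freedom equal the difference in free parameters. AIC can also rank models that are not nested.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better fit."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic for a null model nested in an alternative."""
    return 2 * (lnl_alt - lnl_null)


def best_by_aic(fits):
    """Return the name of the model with the lowest AIC.

    fits maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


if __name__ == "__main__":
    # Placeholder numbers for illustration only (not results from the study).
    fits = {
        "GTR": (-10250.0, 8),
        "NREV6": (-10248.0, 5),
        "NREV12": (-10230.0, 11),
    }
    print("best by AIC:", best_by_aic(fits))
    print("LRT statistic, NREV6 vs NREV12:",
          lrt_statistic(fits["NREV6"][0], fits["NREV12"][0]))
```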

Conclusion: We show that NREV12 should be seriously considered during the model selection phase of phylogenetic analyses involving viral genomic sequences.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9810213
DOI: http://dx.doi.org/10.21203/rs.3.rs-2407778/v1
