Introns are highly prevalent in most eukaryotic genomes. Despite the accumulating evidence for benefits conferred by the possession of introns, their specific roles and functions, as well as the processes shaping their evolution, are still only partially understood. Here, we explore the evolution of the eukaryotic intron-exon gene structure by focusing on several key features such as the intron length, the number of introns, and the intron-to-exon length ratio in protein-coding genes.
Insertions and deletions constitute the second most important source of natural genomic variation. Insertions and deletions account for up to 25% of genomic variants in humans and are involved in complex evolutionary processes including genomic rearrangements, adaptation, and speciation. Recent advances in long-read sequencing technologies allow detailed inference of insertion and deletion variation in species and populations.
Sequencing the mitochondrial genome of the tunicate Oikopleura dioica is a challenging task due to the presence of long poly-A/T homopolymer stretches, which impair sequencing and assembly. Here, we report on the sequencing and annotation of the majority of the mitochondrial genome of O. dioica by combining several DNA and amplicon reads obtained by Illumina and Oxford Nanopore Technologies MinION sequencing with public RNA sequences.
Motivation: Currently used methods for estimating branch support in phylogenetic analyses often rely on the classic Felsenstein's bootstrap, parametric tests, or their approximations. As these branch support scores are widely used in phylogenetic analyses, having accurate, fast, and interpretable scores is of high importance.
Results: Here, we employed a data-driven approach to estimate branch support values with a probabilistic interpretation.
The computational search for the maximum-likelihood phylogenetic tree is an NP-hard problem. As such, current tree search algorithms might result in a tree that is a local optimum rather than the global one. Here, we introduce a paradigm shift for predicting the maximum-likelihood tree, by approximating long-term gains of likelihood rather than maximizing likelihood gain at each step of the search.
Background: Myxozoa is a class of cnidarian parasites that encompasses over 2,400 species. Phylogenetic relationships among myxozoans remain highly debated, owing to both a lack of informative morphological characters and a shortage of molecular markers. Mitochondrial (mt) genomes are a common marker in phylogeny and biogeography.
The genus Xanthomonas has been primarily studied for pathogenic interactions with plants. However, besides host- and tissue-specific pathogenic strains, this genus also comprises nonpathogenic strains isolated from a broad range of hosts, sometimes in association with pathogenic strains, and from other environments, including rainwater. Based on their incapacity or limited capacity to cause symptoms on the host of isolation, nonpathogenic xanthomonads can be further characterized as commensal and weakly pathogenic.
Motivation: Deep-learning models are transforming biological research, including many bioinformatics and comparative genomics algorithms, such as sequence alignments, phylogenetic tree inference, and automatic classification of protein functions. Among these deep-learning algorithms, models for processing natural languages, developed in the natural language processing (NLP) community, were recently applied to biological sequences. However, biological sequences are different from natural languages, such as English and French, in which segmentation of the text into separate words is relatively straightforward.
Motivation: Insertions and deletions (indels) of short DNA segments, along with substitutions, are the most frequent molecular evolutionary events. Indels were shown to affect numerous macro-evolutionary processes. Because indels may span multiple positions, their impact is a product of both their rate and their length distribution.
Being able to accurately predict the three-dimensional structure of an Ab can facilitate Ab characterization and epitope prediction, with important diagnostic and clinical implications. In this study, we evaluated the ability of AlphaFold to predict the structures of 222 recently published, high-resolution Fab H and L chain structures of Abs from different species directed against different Ags. We show that although the overall Ab prediction quality is in line with the results of CASP14, regions such as the complementarity-determining regions (CDRs) of the H chain, which are prone to higher variation, are predicted less accurately.
Acquisition of the pathogenicity plasmid pPATH that encodes a type III secretion system (T3SS) and effectors (T3Es) has likely led to the transition of a non-pathogenic bacterium into the tumorigenic pathogen . pv. () forms galls on gypsophila () and triggers immunity on sugar beet (), while pv.
Multiple sequence alignments (MSAs) are the workhorse of molecular evolution and structural biology research. From MSAs, the amino acids that are tolerated at each site during protein evolution can be inferred. However, little is known regarding the repertoire of tolerated amino acids in proteins when only a few or no sequence homologs are available, such as orphan and de novo designed proteins.
pv. pelargonii is the causative agent of bacterial blight in geranium ornamental plants, the most threatening bacterial disease of this plant worldwide. is the causative agent of angular leaf spot in strawberries, where it poses a significant threat to the strawberry industry.
In the last decade, advances in sequencing technology have led to an exponential increase in genomic data. These new data have dramatically changed our understanding of the evolution and function of genes and genomes. Despite improvements in sequencing technologies, identifying contaminated reads remains a complex task for many research groups.
The ConSurf web server for the analysis of proteins, RNA, and DNA provides a quick and accurate estimate of the per-site evolutionary rate among homologues. The analysis reveals functionally important regions, such as catalytic and ligand-binding sites, which often evolve slowly. Since the last report in 2016, ConSurf has been improved in multiple ways.
Insertions and deletions (indels) of short DNA segments are common evolutionary events. Numerous studies showed that deletions occur more often than insertions in both prokaryotes and eukaryotes. This raises the question of why neutral sequences are not eradicated from the genome.
Type III effectors are proteins injected by Gram-negative bacteria into eukaryotic hosts. In many plant and animal pathogens, these effectors manipulate host cellular processes to the benefit of the bacteria. Type III effectors are secreted by a type III secretion system that must "classify" each bacterial protein into one of two categories: either the protein should be translocated or it should not.
The inference of genome rearrangement events has been extensively studied, as they play a major role in molecular evolution. However, probabilistic evolutionary models that explicitly imitate the evolutionary dynamics of such events, as well as methods to infer model parameters, are yet to be fully utilized. Here, we developed a probabilistic approach to infer genome rearrangement rate parameters using an Approximate Bayesian Computation (ABC) framework.
Measuring evolutionary rates at the residue level is indispensable for gaining structural and functional insights into proteins. State-of-the-art tools for estimating rates take as input a large set of homologous proteins, a probabilistic model of evolution and a phylogenetic tree. However, a gap exists when only few or no homologous proteins can be found, e.
Various Gram-negative bacteria use secretion systems to secrete effector proteins that manipulate host biochemical pathways to their benefit. We and others have previously developed machine-learning algorithms to predict novel effectors. Specifically, given a set of known effectors and a set of known non-effectors, the machine-learning algorithm extracts features that distinguish these two protein groups.
Conservation is a strong predictor for the pathogenicity of single-nucleotide variants (SNVs). However, some positions that present complex conservation patterns across vertebrates stray from this paradigm. Here, we analyzed the association between complex conservation patterns and the pathogenicity of SNVs in the 115 disease-genes that had sufficient variant data.
Motivation: Type-III secretion systems are utilized by many Gram-negative bacteria to inject type-3 effectors (T3Es) into eukaryotic cells. These effectors manipulate host processes for the benefit of the bacteria and thus promote disease. They can also function as host-specificity determinants through their recognition as avirulence proteins that elicit an immune response.