Mobile genetic elements are key to the global emergence of antibiotic resistance. We successfully reconstructed the complete bacterial genome and plasmid assemblies of isolates sharing the same carbapenemase gene to understand evolution over time in six confined hospital drains over five years. From 82 isolates we identified 14 unique strains from 10 species with 113 carrying plasmids across 16 distinct replicon types.
We present a novel framework enhancing the prediction of whether a novel lineage poses the threat of eventually dominating the viral population. The framework is based purely on genomic sequence data, without requiring prior established biological analysis. Its building blocks are sets of coevolving sites in the alignment (motifs), identified via coevolutionary signals.
Mathematical models rooted in network representations are becoming increasingly common for capturing a broad range of phenomena. Boolean networks (BNs) represent a mathematical abstraction suited for establishing general theory applicable to such systems. A key thread in BN research is developing theory that connects the structure of the network and the local rules to phase space properties, or so-called structure-to-function theory.
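To make the link between local rules and phase space concrete, here is a minimal sketch that enumerates the phase space of a small Boolean network. It assumes a synchronous update scheme, and the three example rules and the function names `phase_space` and `fixed_points` are hypothetical illustrations, not taken from the paper.

```python
from itertools import product


def phase_space(rules):
    """Map every state to its successor under synchronous update.

    `rules` is a list of functions; rule i takes the full state tuple
    and returns the next value (0 or 1) of node i.
    """
    n = len(rules)
    return {
        state: tuple(int(rule(state)) for rule in rules)
        for state in product((0, 1), repeat=n)
    }


def fixed_points(rules):
    """Return the states that map to themselves."""
    return [s for s, t in phase_space(rules).items() if s == t]


# Hypothetical 3-node network (assumption, for illustration only):
#   x0' = x1 AND x2,  x1' = x0,  x2' = x0 OR x1
example_rules = [
    lambda x: x[1] & x[2],
    lambda x: x[0],
    lambda x: x[0] | x[1],
]

if __name__ == "__main__":
    print(fixed_points(example_rules))
```

The dependency structure here (which nodes each rule reads) is the network structure; the dictionary returned by `phase_space` is the phase space whose properties, such as fixed points, a structure-to-function theory would aim to predict.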
The study of native motifs of RNA secondary structures helps us better understand the formation and eventually the functions of these molecules. Commonly known structural motifs include helices, hairpin loops, bulges, interior loops, exterior loops and multiloops. However, enumerative results and generating algorithms taking into account the joint distribution of these motifs are sparse.
We propose a novel mathematical paradigm for the study of genetic variation in sequence alignments. This framework originates from extending the notion of pairwise relations, upon which current analysis is based, to k-ary dissimilarity. This dissimilarity naturally leads to a generalization of simplicial complexes by endowing simplices with weights, compatible with the boundary operator.
On the occasion of Dr. Michael Waterman's 80th birthday, we review his major contributions to the field of computational biology and bioinformatics including the famous Smith-Waterman algorithm for sequence alignment, the probability and statistics theory related to sequence alignment, algorithms for sequence assembly, the Lander-Waterman model for genome physical mapping, combinatorics and predictions of ribonucleic acid structures, word counting statistics in molecular sequences, alignment-free sequence comparison, and algorithms for haplotype block partition and tagSNP selection related to the International HapMap Project. His books for graduate students and geared toward undergraduate students played key roles in computational biology and bioinformatics education.
Background: Genotype-phenotype maps provide a meaningful filtration of sequence space, and RNA secondary structures are particular such phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a given RNA structure, play an important role in the context of neutral evolution. Sequences that are simultaneously compatible with two given structures (bicompatible sequences) are beacons in phenotypic transitions, induced by erroneously replicating populations of RNA sequences.
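As an illustration of the compatibility notion, the following minimal sketch checks whether a sequence satisfies the base-pairing constraints of one structure, or of two structures at once. It assumes balanced, pseudoknot-free dot-bracket notation and Watson-Crick plus G-U wobble pairs; the function names are hypothetical and not from the paper.

```python
# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def base_pairs(dot_bracket):
    """Return the list of paired positions (i, j), 0-indexed.

    Assumes a balanced dot-bracket string using only '(', ')' and '.'.
    """
    stack, pairs = [], []
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))
    return pairs


def is_compatible(seq, dot_bracket):
    """True if every arc of the structure joins an allowed base pair."""
    if len(seq) != len(dot_bracket):
        return False
    return all((seq[i], seq[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(dot_bracket))


def is_bicompatible(seq, structure_a, structure_b):
    """True if the sequence is compatible with both structures."""
    return is_compatible(seq, structure_a) and is_compatible(seq, structure_b)
```

For example, `is_bicompatible("GGGGUUUCC", "((...))..", "..((...))")` holds because the G-U pairs satisfy the first structure and the G-C pairs satisfy the second.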
COVID-19 is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The viral genome is considered to be relatively stable and the mutations that have been observed and reported thus far are mainly focused on the coding region. This article provides evidence that macrolevel pandemic dynamics, such as social distancing, modulate the genomic evolution of SARS-CoV-2.
Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via chemical probing data. We employ an information-theoretic approach to solve the problem by considering a variant of the Rényi-Ulam game.
Genetic robustness, the preservation of evolved phenotypes against genotypic mutations, is one of the central concepts in evolution. In recent years a large body of work has focused on the origins, mechanisms, and consequences of robustness in a wide range of biological systems. In particular, research on ncRNAs studied the ability of sequences to maintain folded structures against single-point mutations.
In this paper we analyze the length-spectrum of blocks in γ-structures. γ-structures are a class of RNA pseudoknot structures that play a key role in the context of polynomial time RNA folding. A γ-structure is constructed by nesting and concatenating specific building components having topological genus at most γ.
Recently, a framework considering RNA sequences and their RNA secondary structures as pairs led to some information-theoretic perspectives on how the semantics encoded in RNA sequences can be inferred. This pairing arises naturally from the energy model of RNA secondary structures. Fixing the sequence in the pairing produces the RNA energy landscape, whose partition function was discovered by McCaskill.
Background: The neutral theory of Motoo Kimura stipulates that evolution is mostly driven by neutral mutations. However, adaptive pressure eventually leads to changes in phenotype that involve non-neutral mutations. The relation between neutrality and adaptation has been studied in the context of RNA before, and here we further study transitional mutations in the context of degenerate (plastic) RNA sequences and genetic assimilation.
Recently, a framework considering ribonucleic acid (RNA) sequences and their RNA secondary structures as pairs has led to new information-theoretic perspectives on how the semantics encoded in RNA sequences can be inferred. In this context, the pairing arises naturally from the energy model of RNA secondary structures. Fixing the sequence in the pairing produces the RNA energy landscape, whose partition function was discovered by McCaskill.
Motivation: Motoo Kimura's neutral theory stipulates that evolution is predominantly driven by neutral mutations. RNA, realizing both genotype (its linear sequence of nucleotides) and phenotype (its folded secondary structure), represents a particularly well-suited test bed for studying neutrality. This leads to neutral networks of RNA secondary structures, i.
In this paper, we analyze the length spectrum of rainbows in RNA secondary structures. A rainbow in a secondary structure is a maximal arc with respect to the partial order induced by nesting. We show that there is a significant gap in this length spectrum.
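The definition of a rainbow can be illustrated with a short sketch that extracts the maximal arcs from a structure. It assumes the structure is given in dot-bracket notation and takes the length of an arc (i, j) to be j - i; both the input format and that length convention are assumptions for illustration, and the function names are hypothetical.

```python
def rainbows(dot_bracket):
    """Return the maximal arcs (i, j), 0-indexed, of a dot-bracket structure.

    An arc is maximal with respect to nesting when no other arc encloses
    it, i.e. when the stack of open arcs is empty once it has been closed.
    """
    stack, result = [], []
    for pos, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            start = stack.pop()
            if not stack:  # nothing encloses this arc, so it is a rainbow
                result.append((start, pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return result


def rainbow_lengths(dot_bracket):
    """Lengths j - i of all rainbows (length convention is an assumption)."""
    return [j - i for i, j in rainbows(dot_bracket)]
```

For the structure `((..)).(...)`, the arc (1, 4) is nested inside (0, 5) and is therefore not a rainbow, while (0, 5) and (7, 11) are.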
Motivation: DNA data is transcribed into single-stranded RNA, which folds into specific molecular structures. In this paper we ask to what extent sequence and structure information correlate. We view this correlation as structural semantics of sequence data that allows for a different interpretation than conventional sequence alignment.
In this paper we study properties of topological RNA structures, i.e. RNA contact structures with cross-serial interactions that are filtered by their topological genus.
In this paper we introduce a novel, context-free grammar, RNAFeatures, capable of generating any RNA structure including pseudoknot structures (pk-structure). We represent pk-structures as orientable fatgraphs, which naturally leads to a filtration by their topological genus. Within this framework, RNA secondary structures correspond to pk-structures of genus zero.
Given a random RNA secondary structure, S, we study RNA sequences having fixed ratios of nucleotides that are compatible with S. We perform this analysis for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. Our main result reads as follows: in the simplex of nucleotide ratios, there exists a convex region, in which, in the limit of long sequences, a random structure asymptotically almost surely (a.
A topological RNA structure is derived by fattening the edges of a contact structure into ribbons. The shape of a topological RNA structure is obtained by collapsing the stacks of the structure into single arcs and by removing any arcs of length one, as well as isolated vertices. A shape contains the key topological information of the molecular conformation and for fixed topological genus there exist only finitely many such shapes.
Interacting RNA complexes are studied via bicellular maps using a filtration via their topological genus. Our main result is a new bijection for RNA-RNA interaction structures and a linear time uniform sampling algorithm for RNA complexes of fixed topological genus. The bijection allows one either to reduce the topological genus of a bicellular map directly, or to lose connectivity by decomposing the complex into a pair of single-stranded RNA structures.
Shapes of interacting RNA complexes are studied using a filtration via their topological genus. A shape of an RNA complex is obtained by (iteratively) collapsing stacks and eliminating hairpin loops. This shape projection preserves the topological core of the RNA complex, and for fixed topological genus there are only finitely many such shapes.
In this article we study canonical γ-structures, a class of RNA pseudoknot structures that plays a key role in the context of polynomial time folding of RNA pseudoknot structures. A γ-structure is composed of specific building blocks that have topological genus less than or equal to γ, where composition means concatenation and nesting of such blocks. Our main result is the derivation of the generating function of γ-structures via symbolic enumeration using so-called irreducible shadows.
In this paper we present a sampling framework for RNA structures of fixed topological genus. We introduce a novel, linear time, uniform sampling algorithm for RNA structures of fixed topological genus g, for arbitrary g>0. Furthermore, we develop a linear time sampling algorithm for RNA structures of fixed topological genus g that are weighted by a simplified, loop-based energy functional.