Selfish genetic elements contribute to hybrid incompatibility and bias or 'drive' their own transmission. Chromosomal drive typically functions in asymmetric female meiosis, whereas gene drive is normally post-meiotic and typically found in males. Here, using single-molecule and single-pollen genome sequencing, we describe Teosinte Pollen Drive, an instance of gene drive in hybrids between maize (Zea mays ssp.
J Mammary Gland Biol Neoplasia
January 2024
During female adolescence and pregnancy, rising levels of hormones result in a cyclic source of signals that control the development of mammary tissue. While such alterations are well understood from a whole-gland perspective, the alterations that such hormones bring to organoid cultures derived from mammary glands have yet to be fully mapped. This is of special importance given that organoids are considered suitable systems to understand cross species breast development.
Across all branches of life, transcription elongation is a crucial, regulated phase in gene expression. Many recent studies in eukaryotes have focused on the regulation of promoter-proximal pausing of RNA Polymerase II (Pol II), but rates of productive elongation also vary substantially throughout the gene body, both within and across genes. Here, we introduce a probabilistic model for systematically evaluating potential determinants of the local elongation rate based on nascent RNA sequencing (NRS) data.
AlphaMissense is a recently developed method that is designed to classify missense variants into pathogenic, benign, or ambiguous categories across the entire human proteome. Asparagine Synthetase Deficiency (ASNSD) is a developmental disorder associated with severe symptoms, including congenital microcephaly, seizures, and premature death. Diagnosing ASNSD relies on identifying mutations in the asparagine synthetase (ASNS) gene through DNA sequencing and determining whether these variants are pathogenic or benign.
Investigators have recently introduced powerful methods for population genetic inference that rely on supervised machine learning from simulated data. Despite their performance advantages, these methods can fail when the simulated training data does not adequately resemble data from the real world. Here, we show that this "simulation mis-specification" problem can be framed as a "domain adaptation" problem, where a model learned from one data distribution is applied to a dataset drawn from a different distribution.
In metazoans, both transcription initiation and the escape of RNA polymerase (RNAP) from promoter-proximal pausing are key rate-limiting steps in gene expression. These processes play out at physically proximal sites on the DNA template and appear to influence one another through steric interactions. Here, we examine the dynamics of these processes using a combination of statistical modeling, simulation, and analysis of real nascent RNA sequencing data.
Bats are exceptional among mammals for their powered flight, extended lifespans, and robust immune systems and therefore have been of particular interest in comparative genomics. Using the Oxford Nanopore Technologies long-read platform, we sequenced the genomes of two bat species with key phylogenetic positions, the Jamaican fruit bat (Artibeus jamaicensis) and the Mesoamerican mustached bat (Pteronotus mesoamericanus), and carried out a comprehensive comparative genomic analysis with a diverse collection of bats and other mammals. The high-quality, long-read genome assemblies revealed a contraction of interferon (IFN)-α at the immunity-related type I IFN locus in bats, resulting in a shift in relative IFN-ω and IFN-α copy numbers.
Meiotic drivers subvert Mendelian expectations by manipulating reproductive development to bias their own transmission. Chromosomal drive typically functions in asymmetric female meiosis, while gene drive is normally postmeiotic and typically found in males. Using single-molecule and single-pollen genome sequencing, we describe Teosinte Pollen Drive, an instance of gene drive in hybrids between maize and teosinte that depends on RNA interference (RNAi).
Promoter-proximal RNA Pol II pausing is a critical step in transcriptional control. Pol II pausing has been predominantly studied in tissue culture systems. While Pol II pausing has been shown to be required for mammalian development, the phenotypic and mechanistic details of this requirement are unknown.
Large-scale genome sequencing has enabled the measurement of strong purifying selection in protein-coding genes. Here we describe a new method, called ExtRaINSIGHT, for measuring such selection in noncoding as well as coding regions of the human genome. ExtRaINSIGHT estimates the prevalence of "ultraselection" by the fractional depletion of rare single-nucleotide variants, after controlling for variation in mutation rates.
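To make "fractional depletion" concrete, here is a minimal sketch of the general idea only. It is not ExtRaINSIGHT's actual estimator, which the abstract does not specify. The function name, its parameters, and the assumption that each site carries a neutral, mutation-rate-adjusted probability of harboring a rare variant are all hypothetical.

```python
from typing import Sequence


def fractional_depletion(observed_rare: int, neutral_site_probs: Sequence[float]) -> float:
    """Return 1 - observed/expected for rare variants in a target region.

    neutral_site_probs (hypothetical input): per-site probability of carrying
    a rare variant under neutrality, already adjusted for the local mutation rate.
    """
    expected = sum(neutral_site_probs)  # expected rare-variant count under neutrality
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return 1.0 - observed_rare / expected


# Illustrative numbers only: 1000 sites, each with a 5% neutral probability,
# so about 50 rare variants are expected; 30 are observed.
depletion = fractional_depletion(30, [0.05] * 1000)
```

A value near 0 would indicate no depletion relative to the neutral expectation, while a value near 1 would indicate that almost all expected rare variants are missing.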
Alignments of multiple genomes are a cornerstone of comparative genomics, but generating these alignments remains technically challenging and often impractical. We developed the msa_pipeline workflow (https://bitbucket.org/bucklerlab/msa_pipeline) to allow practical and sensitive multiple alignment of diverged plant genomes and calculation of conservation scores with minimal user inputs.
Detecting signals of selection from genomic data is a central problem in population genetics. Coupling the rich information in the ancestral recombination graph (ARG) with a powerful and scalable deep-learning framework, we developed a novel method to detect and quantify positive selection: Selection Inference using the Ancestral recombination graph (SIA). Built on a Long Short-Term Memory (LSTM) architecture, a particular type of Recurrent Neural Network (RNN), SIA can be trained to explicitly infer a full range of selection coefficients, as well as the allele frequency trajectory and time of selection onset.
High-throughput CRISPR-Cas9 knockout screens are widely used to evaluate gene essentiality in cancer research. Here we introduce a probabilistic modeling framework, Analysis of CRISPR-based Essentiality (ACE), that accounts for multiple sources of variation in CRISPR-Cas9 screens and enables new statistical tests for essentiality. We show using simulations that ACE is effective at predicting both absolute and differential essentiality.
Motivation: Quantification of isoform abundance has been extensively studied at the mature RNA level using RNA-seq but not at the level of precursor RNAs using nascent RNA sequencing.
Results: We address this problem with a new computational method called Deconvolution of Expression for Nascent RNA-sequencing data (DENR), which models nascent RNA-sequencing read-counts as a mixture of user-provided isoforms. The baseline algorithm is enhanced by machine-learning predictions of active transcription start sites and an adjustment for the typical 'shape profile' of read-counts along a transcription unit.
The developing mammary gland depends on several transcription-dependent networks to define cellular identities and differentiation trajectories. Recent technological advancements that allow for single-cell profiling of gene expression have provided an initial picture into the epithelial cellular heterogeneity across the diverse stages of gland maturation. Still, a deeper dive into expanded molecular signatures would improve our understanding of the diversity of mammary epithelial and non-epithelial cellular populations across different tissue developmental stages, mouse strains and mammalian species.
The innate and adaptive immune responses are regulated by biological clocks, and circulating lymphocytes are lowest at sunrise. Accordingly, severity of disease in mouse models is highly dependent on the time of day of viral infection. Here, we explore whether circadian immunity contributes significantly to seasonality of respiratory viruses, including influenza and SARS-CoV-2.
Background: The concentrations of distinct types of RNA in cells result from a dynamic equilibrium between RNA synthesis and decay. Despite the critical importance of RNA decay rates, current approaches for measuring them are generally labor-intensive, limited in sensitivity, and/or disruptive to normal cellular processes. Here, we introduce a simple method for estimating relative RNA half-lives that is based on two standard and widely available high-throughput assays: Precision Run-On sequencing (PRO-seq) and RNA sequencing (RNA-seq).
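The equilibrium argument can be sketched numerically. If steady-state abundance is approximately synthesis rate divided by decay rate, then half-life is proportional to abundance divided by synthesis rate. The sketch below assumes, hypothetically, that RNA-seq signal stands in for abundance and PRO-seq signal for synthesis rate; the function name, inputs, and median normalization are illustrative choices and not necessarily the authors' procedure.

```python
from statistics import median
from typing import Dict


def relative_half_lives(rna_seq: Dict[str, float], pro_seq: Dict[str, float]) -> Dict[str, float]:
    """Per-gene RNA-seq / PRO-seq ratio, scaled so the median gene equals 1.

    Both inputs are hypothetical normalized expression values keyed by gene.
    Genes with no positive PRO-seq signal are skipped, since the ratio is undefined.
    """
    ratios = {
        gene: rna_seq[gene] / pro_seq[gene]
        for gene in rna_seq
        if pro_seq.get(gene, 0.0) > 0
    }
    if not ratios:
        raise ValueError("no genes with positive PRO-seq signal")
    scale = median(ratios.values())  # the ratio is only meaningful up to a constant
    return {gene: ratio / scale for gene, ratio in ratios.items()}


# Illustrative values only: equal synthesis signal, different steady-state abundance.
rna = {"geneA": 100.0, "geneB": 50.0, "geneC": 10.0}
pro = {"geneA": 10.0, "geneB": 10.0, "geneC": 10.0}
half_lives = relative_half_lives(rna, pro)
```

Because the two assays are on different scales, the ratio is informative only relative to other genes, which is why the sketch normalizes to the median.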
Numerous studies of emerging species have identified genomic "islands" of elevated differentiation against a background of relative homogeneity. The causes of these islands remain unclear, however, with some signs pointing toward "speciation genes" that locally restrict gene flow and others suggesting selective sweeps that have occurred within nascent species after speciation. Here, we examine this question through the lens of genome sequence data for five species of southern capuchino seedeaters, finch-like birds from South America that have undergone a species radiation during the last ∼50,000 generations.
The sequencing of Neanderthal and Denisovan genomes has yielded many new insights about interbreeding events between extinct hominins and the ancestors of modern humans. While much attention has been paid to the relatively recent gene flow from Neanderthals and Denisovans into modern humans, other instances of introgression leave more subtle genomic evidence and have received less attention. Here, we present a major extension of the ARGweaver algorithm, called ARGweaver-D, which can infer local genetic relationships under a user-defined demographic model that includes population splits and migration events.