PLoS Comput Biol
December 2024
Phylogenies depicting the evolutionary history of genetically heterogeneous subpopulations of cells from the same cancer, i.e., cancer phylogenies, offer valuable insights about cancer development and guide treatment strategies.
Alternative splicing is essential for plants, enabling a single gene to produce multiple transcript variants to boost functional diversity and fine-tune responses to environmental and developmental cues. At-RS31, a plant-specific splicing factor in the Serine/Arginine (SR)-rich protein family, responds to light and the Target of Rapamycin (TOR) signaling pathway, yet its downstream targets and regulatory impact remain unknown. To identify At-RS31 targets, we applied individual-nucleotide resolution crosslinking and immunoprecipitation (iCLIP) and RNAcompete assays.
Most of the human genome is thought to be non-functional, and includes large segments often referred to as "dark matter" DNA. The genome also encodes hundreds of putative and poorly characterized transcription factors (TFs). We determined genomic binding locations of 166 uncharacterized human TFs in living cells.
RNA-binding proteins (RBPs) are key regulators of gene expression. Here, we introduce EuPRI (Eukaryotic Protein-RNA Interactions), a freely available resource of RNA motifs for 34,736 RBPs from 690 eukaryotes. EuPRI includes binding data for 504 RBPs, including newly collected RNAcompete data for 174 RBPs, along with thousands of reconstructed motifs.
In the face of rapidly accumulating genomic data, our ability to accurately predict key mature RNA properties that underlie transcript function and regulation remains limited. Pre-trained genomic foundation models offer an avenue to adapt learned RNA representations to biological prediction tasks. However, existing genomic foundation models are trained using strategies borrowed from textual or visual domains that do not leverage biological domain knowledge.
The metastatic spread of a cancer can be reconstructed from DNA sequencing of primary and metastatic tumours, but doing so requires solving a challenging combinatorial optimization problem. This problem often has multiple solutions that cannot be distinguished based on current maximum parsimony principles alone. Current algorithms use ad hoc criteria to select among these solutions, and decide, a priori, what patterns of metastatic spread are more likely, which is itself a key question posed by studies of metastasis seeking to use these tools.
Subclonal reconstruction algorithms use bulk DNA sequencing data to quantify parameters of tumor evolution, allowing an assessment of how cancers initiate, progress and respond to selective pressures. We launched the ICGC-TCGA (International Cancer Genome Consortium-The Cancer Genome Atlas) DREAM Somatic Mutation Calling Tumor Heterogeneity and Evolution Challenge to benchmark existing subclonal reconstruction algorithms. This 7-year community effort used cloud computing to benchmark 31 subclonal reconstruction algorithms on 51 simulated tumors.
The basal breast cancer subtype is enriched for triple-negative breast cancer (TNBC) and displays consistent large chromosomal deletions. Here, we characterize evolution and maintenance of chromosome 4p (chr4p) loss in basal breast cancer. Analysis of The Cancer Genome Atlas data shows recurrent deletion of chr4p in basal breast cancer.
Stem cells regulate their self-renewal and differentiation fate outcomes through both symmetric and asymmetric divisions. m6A RNA methylation controls symmetric commitment and inflammation of hematopoietic stem cells (HSCs) through unknown mechanisms. Here, we demonstrate that the nuclear speckle protein SON is an essential m6A target required for murine HSC self-renewal, symmetric commitment, and inflammation control.
Tumor type guides clinical treatment decisions in cancer, but histology-based diagnosis remains challenging. Genomic alterations are highly diagnostic of tumor type, and tumor type classifiers trained on genomic features have been explored, but the most accurate methods are not clinically feasible, relying on features derived from whole genome sequencing (WGS), or predicting across limited cancer types. We use genomic features from a dataset of 39,787 solid tumors sequenced using a clinical targeted cancer gene panel to develop Genome-Derived-Diagnosis Ensemble (GDD-ENS): a hyperparameter ensemble for classifying tumor type using deep neural networks.
The mRNA 3' poly(A) tail plays a critical role in regulating both mRNA translation and turnover. It is bound by the cytoplasmic poly(A) binding protein (PABPC), an evolutionarily conserved protein that can interact with translation factors and mRNA decay machineries to regulate gene expression. Mammalian PABPC1, the prototypical PABPC, is expressed in most tissues and interacts with eukaryotic translation initiation factor 4G (eIF4G) to stimulate translation in specific contexts.
Thousands of RNA-binding proteins (RBPs) crosslink to cellular mRNA. Among these are numerous unconventional RBPs (ucRBPs), proteins that associate with RNA but lack known RNA-binding domains (RBDs). The vast majority of ucRBPs have uncharacterized RNA-binding specificities.
Cancer genomes harbor a catalog of somatic mutations. The type and genomic context of these mutations depend on their causes and allow their attribution to particular mutational signatures. Previous work has shown that mutational signature activities change over the course of tumor development, but investigations of genomic region variability in mutational signatures have been limited.
Pairtree is a clone tree reconstruction algorithm that uses somatic point mutations to build clone trees describing the evolutionary history of individual cancers. Using the Pairtree software package, we describe steps to preprocess somatic mutation data, cluster mutations into subclones, search for clone trees, and visualize clone trees. Pairtree builds clone trees using up to 100 samples from a single cancer with at least 30 subclonal populations.
The coronavirus disease 2019 (COVID-19) pandemic has caused millions of deaths around the world and revealed the need for data-driven models of pandemic spread. Accurate pandemic caseload forecasting allows informed policy decisions on the adoption of non-pharmaceutical interventions (NPIs) to reduce disease transmission. Using COVID-19 as an example, we present Pandemic conditional Ordinary Differential Equation (PAN-cODE), a deep learning method to forecast daily increases in pandemic infections and deaths.
Modelling both primary sequence and secondary structure preferences for RNA binding proteins (RBPs) remains an ongoing challenge. Current models use varied RNA structure representations and can be difficult to interpret and evaluate. To address these issues, we present a universal RNA motif-finding/scanning strategy, termed PRIESSTESS (Predictive RBP-RNA InterpretablE Sequence-Structure moTif regrESSion), that can be applied to diverse RNA binding datasets.
Background and Objectives: Recent literature has highlighted the role of the host in the prognosis of oral squamous cell carcinoma (OSCC). In this study, we retrospectively examined the impact of autoimmune (AI) disorders as an aspect of the host status on survival outcomes in OSCC patients.
Methods: From a departmental database of OSCC patients (n = 1369), 123 patients with an AI disorder were identified.
The mTOR is a key regulator of cell growth that integrates growth factor signaling and nutrient availability and is a downstream effector of oncogenic receptor tyrosine kinases (RTK) and PI3K/Akt signaling. Thus, activating mTOR mutations would be expected to enhance growth in many tumor types. However, tumor sequencing data have shown that mTOR mutations are enriched only in renal clear cell carcinoma, a clinically hypervascular tumor unlikely to be constrained by nutrient availability.
Understanding the regulatory interactions that control gene expression during the development of novel tissues is a key goal of evolutionary developmental biology. Here, we show that Mbnl3 has undergone a striking process of evolutionary specialization in eutherian mammals resulting in the emergence of a novel placental function for the gene. Mbnl3 belongs to a family of RNA-binding proteins whose members regulate multiple aspects of RNA metabolism.
Cancers are composed of genetically distinct subpopulations of malignant cells. DNA-sequencing data can be used to determine the somatic point mutations specific to each population and build clone trees describing the evolutionary relationships between them. These clone trees can reveal critical points in disease development and inform treatment.