Polycomb Repressive Complex 2 (PRC2) is an epigenetic regulator required for gene silencing during development. Although PRC2 is a well-established RNA-binding complex, the biological function of PRC2-RNA interaction has been controversial. Here, we study the gene-regulatory role of the inhibitory PRC2-RNA interactions.
The study of RNA modifications in large clinical cohorts can reveal relationships between the epitranscriptome and human diseases, although this is especially challenging. We developed ModTect (https://github.com/ktan8/ModTect), a statistical framework to identify RNA modifications de novo by standard RNA-sequencing with deletion and mis-incorporation signals.
The mechanism underlying cell type-specific gene induction conferred by ubiquitous transcription factors as well as disruptions caused by their chimeric derivatives in leukemia is not well understood. Here, we investigate whether RNAs coordinate with transcription factors to drive myeloid gene transcription. In an integrated genome-wide approach surveying for gene loci exhibiting concurrent RNA and DNA interactions with the broadly expressed Runt-related transcription factor 1 (RUNX1), we identified the long noncoding RNA (lncRNA) originating from the upstream regulatory element of PU.
The zinc finger transcription factor SALL4 is highly expressed in embryonic stem cells, downregulated in most adult tissues, but reactivated in many aggressive cancers. This unique expression pattern makes SALL4 an attractive therapeutic target. However, whether SALL4 binds DNA directly to regulate gene expression is unclear, and many of its targets in cancer cells remain elusive.
Next-generation sequencing (NGS) is a widely used technology in biomedical research for understanding the role of molecular genetics of cells in health and disease. A variety of computational tools have been developed to analyse the rapidly growing volume of NGS data, but they often require bioinformatics skills, tedious work and a significant amount of time. To facilitate data processing and bridge the gap between biologists and bioinformaticians, we developed CSI NGS Portal, an online platform that gathers established bioinformatics pipelines to provide fully automated NGS data analysis and sharing in a user-friendly website.
The development of molecular targeted therapies, such as EGFR-TKIs, has positively impacted the management of EGFR-mutated NSCLC. However, patients with innate and acquired resistance to EGFR-TKIs still face limited effective therapeutic options. Statins are the most frequently prescribed anti-cholesterol agents and have been reported to inhibit the progression of various malignancies, including lung cancer.
Background & Aims: Some oncogenes encode transcription factors, but few drugs have been successfully developed to block their activity specifically in cancer cells. The transcription factor SALL4 is aberrantly expressed in solid tumor and leukemia cells. We developed a screen to identify compounds that reduce the viability of liver cancer cells that express high levels of SALL4, and we investigated their mechanisms.
DNMT3B is known as a de novo DNA methyltransferase. However, its preferential target sites for DNA methylation are largely unknown. Our analysis of ChIP-seq data from human embryonic stem cells (hESC) revealed that DNMT3B, mCA and H3K36me3 share the same genomic distribution profile.
Sal-like 4 (SALL4) is a nuclear factor central to the maintenance of stem cell pluripotency and is a key component in hepatocellular carcinoma, a malignancy with no effective treatment. In cancer cells, SALL4 associates with nucleosome remodeling deacetylase (NuRD) to silence tumor-suppressor genes, such as PTEN. Here, we determined the crystal structure of an amino-terminal peptide of SALL4(1-12) complexed to RBBp4, the chaperone subunit of NuRD, at 2.
Metabolic reprogramming is a widely recognized hallmark of cancer that allows cells to adapt and sustain survival signals. In this report, we describe a novel oncogenic signaling pathway acting exclusively in epidermal growth factor receptor (EGFR)-mutated non-small cell lung cancer (NSCLC) with acquired tyrosine kinase inhibitor (TKI) resistance. Mutated EGFR mediates TKI resistance through regulation of the fatty acid synthase (FASN), which produces the 16-carbon saturated fatty acid palmitate.
Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNP O/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions.
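As a rough illustration of an observed-to-expected ratio for coincident SNPs, the sketch below assumes the simplest possible null model: SNPs occur independently in the two species, so the expected number of coincident sites is the product of the two SNP counts divided by the number of orthologous sites. The abstract does not state how the expectation is computed, so this null model, the function name `cosnp_observed_to_expected` and the toy positions are all assumptions made for illustration only.

```python
def cosnp_observed_to_expected(human_snps, chimp_snps, n_sites):
    """Observed/expected ratio of coincident SNPs under an independence null.

    human_snps, chimp_snps: iterables of orthologous site indices carrying a SNP.
    n_sites: total number of orthologous sites surveyed.
    Hypothetical helper; the independence null is an assumption, not the
    method described in the abstract.
    """
    human = set(human_snps)
    chimp = set(chimp_snps)
    # Observed: sites polymorphic in both species.
    observed = len(human & chimp)
    # Expected under independence: N * (h / N) * (c / N) = h * c / N.
    expected = len(human) * len(chimp) / n_sites
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return observed / expected


# Toy data: 4 human SNPs, 4 chimpanzee SNPs, 2 shared sites, 100 sites total.
ratio = cosnp_observed_to_expected({1, 2, 3, 4}, {3, 4, 5, 6}, 100)
print(ratio)
```

A ratio above 1 would indicate more coincident sites than the independence null predicts; a realistic analysis would likely need a null that accounts for sequence context.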
Analysis of RNA-seq data often detects numerous 'non-co-linear' (NCL) transcripts, which comprise sequence segments that are topologically inconsistent with their corresponding DNA sequences in the reference genome. However, detection of NCL transcripts involves two major challenges: removal of false positives arising from alignment artifacts and discrimination between different types of NCL transcripts (trans-spliced, circular or fusion transcripts). Here, we developed a new NCL-transcript-detecting method ('NCLscan'), which utilizes a stepwise alignment strategy to almost completely eliminate false calls (>98% precision) without sacrificing true positives. This enables NCLscan to outperform 18 other publicly available tools (including fusion- and circular-RNA-detecting tools) in terms of sensitivity and precision, regardless of the generation strategy of the simulated dataset, the type of intragenic or intergenic NCL event, the read depth of coverage, the read length or the expression level of the NCL transcript.
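For readers unfamiliar with the two metrics named above, the following minimal sketch shows how precision and sensitivity are conventionally computed when a tool's calls are compared against a simulated truth set. It is not NCLscan's evaluation code; the function name `precision_and_sensitivity` and the event labels are hypothetical.

```python
def precision_and_sensitivity(called, truth):
    """Return (precision, sensitivity) for a set of calls against a truth set.

    called: events reported by a detection tool.
    truth:  events actually present in the simulated dataset.
    Hypothetical helper for illustration only.
    """
    called = set(called)
    truth = set(truth)
    true_pos = len(called & truth)
    # Precision: fraction of reported events that are real.
    precision = true_pos / len(called) if called else 0.0
    # Sensitivity (recall): fraction of real events that were reported.
    sensitivity = true_pos / len(truth) if truth else 0.0
    return precision, sensitivity


# Toy example: 4 calls, 3 of them correct, out of 5 simulated events.
p, s = precision_and_sensitivity({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f"})
print(p, s)
```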
Trans-splicing is a post-transcriptional event that joins exons from separate pre-mRNAs. Detection of trans-splicing is usually severely hampered by experimental artifacts and genetic rearrangements. Here, we develop a new computational pipeline, TSscan, which integrates different types of high-throughput long-/short-read transcriptome sequencing of different human embryonic stem cell (hESC) lines to effectively minimize false positives while detecting trans-splicing.