RNA sequencing technology combining short-read and long-read analysis can be used to detect chimeric RNAs in malignant cells. Here, we propose an integrated approach that uses k-mers to analyze indexed datasets. This approach is used to identify chimeric RNA in chronic myelomonocytic leukemia (CMML) cells, a myeloid malignancy that combines features of myelodysplastic and myeloproliferative neoplasms.
Anaplastic large cell lymphomas associated with ALK translocation have a good outcome after CHOP treatment; however, the 2-year relapse rate remains at 30%. Microarray gene-expression profiling of 48 samples obtained at diagnosis was used to identify 47 genes that were differentially expressed between patients with early relapse/progression and no relapse. In the relapsing group, the most significant overrepresented genes were related to the regulation of the immune response and T-cell activation while those in the non-relapsing group were involved in the extracellular matrix.
The huge body of publicly available RNA-sequencing (RNA-seq) libraries is a treasure trove of functional information that makes it possible to quantify the expression of known or novel transcripts in tissues. However, transcript quantification commonly relies on alignment methods that require substantial computational resources and processing time and do not scale easily to large datasets. k-mer decomposition constitutes a new way to process RNA-seq data for the identification of transcriptional signatures, as k-mers can be used to accurately quantify gene expression in a less resource-consuming way.
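As an illustration of the k-mer decomposition idea described above, the following minimal sketch counts k-mers in a handful of reads and summarizes the abundance of a small signature. The function names, the toy reads, and the choice of averaging signature k-mer counts are assumptions made for illustration only; they are not taken from the article.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Count all k-mers across a collection of reads, without any alignment."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read.upper(), k))
    return counts


def signature_abundance(counts, signature_kmers):
    """Mean count of the k-mers chosen to represent a transcript (hypothetical summary)."""
    if not signature_kmers:
        return 0.0
    return sum(counts[km] for km in signature_kmers) / len(signature_kmers)


# Toy data: three short reads and a two-k-mer signature (illustrative only).
reads = ["ACGTACGT", "CGTACGTA", "TTTTACGT"]
counts = count_kmers(reads, k=4)
abundance = signature_abundance(counts, ["ACGT", "CGTA"])
print(counts["ACGT"], abundance)
```

Because the counting step never consults a reference genome, its cost grows with the number of reads rather than with an alignment search, which is the property the abstract highlights.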
Background: The development of RNA sequencing (RNAseq) and the corresponding emergence of public datasets have created new avenues of transcriptional marker search. The long non-coding RNAs (lncRNAs) constitute an emerging class of transcripts with a potential for high tissue specificity and function. Therefore, we tested the biomarker potential of lncRNAs on Mesenchymal Stem Cells (MSCs), a complex type of adult multipotent stem cells of diverse tissue origins, that is frequently used in clinics but which is lacking extensive characterization.
Genomic integrity of human pluripotent stem cells (hPSCs) is essential for research and clinical applications. However, genetic abnormalities can accumulate during hPSC generation and routine culture and following gene editing. Their occurrence should be regularly monitored, but the current assays to assess hPSC genomic integrity are not fully suitable for such regular screening.
High-throughput next generation sequencing (NGS) technologies enable the detection of biomarkers used for tumor classification, disease monitoring and cancer therapy. Whole-transcriptome analysis using RNA-seq is important, not only as a means of understanding the mechanisms responsible for complex diseases but also to efficiently identify novel genes/exons, splice isoforms, RNA editing, allele-specific mutations, differential gene expression and fusion-transcripts or chimeric RNA (chRNA). We used Crac, a tool that uses genomic locations and local coverage to classify biological events and directly infer splice and chimeric junctions within a single read.
The RNA-Seq approach enables the detection and characterization of fusion or chimeric transcripts associated with complex genome rearrangements. Until now, these events have classically been identified at the DNA level. Here we describe a complete procedure including a novel way of analyzing reads that combines genomic locations and local coverage to directly infer chimeric junctions with high sensitivity and specificity, allowing identification of different classes of chimeric RNA events.
Progress in assisted reproductive technologies strongly relies on understanding the regulation of the dialogue between oocyte and cumulus cells (CCs). Little is known about the role of long non-coding RNAs (lncRNAs) in the human cumulus-oocyte complex (COC). To this aim, publicly available RNA-sequencing data were analyzed to identify lncRNAs that were abundant in metaphase II (MII) oocytes (BCAR4, C3orf56, TUNAR, OOEP-AS1, CASC18, and LINC01118) and CCs (NEAT1, MALAT1, ANXA2P2, MEG3, IL6STP1, and VIM-AS1).
We introduce a k-mer-based computational protocol, DE-kupl, for capturing local RNA variation in a set of RNA-seq libraries, independently of a reference genome or transcriptome. DE-kupl extracts all k-mers with differential abundance directly from the raw data files. This enables the retrieval of virtually all variation present in an RNA-seq data set.
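The general idea of selecting k-mers by differential abundance can be sketched as follows. This is a deliberately simplified stand-in, assuming a plain log2 fold-change filter on mean counts with a pseudocount; it does not reproduce DE-kupl's actual counting, normalization, or statistical testing, and all names and thresholds below are hypothetical.

```python
import math
from collections import Counter


def count_kmers(reads, k):
    """Count overlapping k-mers across the reads of one library."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def differential_kmers(group_a, group_b, k, min_abs_log2fc=1.0, pseudocount=1.0):
    """Return {k-mer: log2 fold change (B vs A)} for k-mers passing the threshold.

    group_a and group_b are lists of libraries; each library is a list of reads.
    No reference genome or transcriptome is consulted at any point.
    """
    a = [count_kmers(lib, k) for lib in group_a]
    b = [count_kmers(lib, k) for lib in group_b]
    all_kmers = set().union(*a, *b)
    selected = {}
    for km in all_kmers:
        mean_a = sum(c[km] for c in a) / len(a)
        mean_b = sum(c[km] for c in b) / len(b)
        lfc = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
        if abs(lfc) >= min_abs_log2fc:
            selected[km] = lfc
    return selected


# Toy libraries: the two conditions differ by one base; "TTTTT" is shared.
condition_a = [["ACGTAC", "TTTTT"], ["ACGTAC", "TTTTT"]]
condition_b = [["ACGGAC", "TTTTT"], ["ACGGAC", "TTTTT"]]
result = differential_kmers(condition_a, condition_b, k=4)
print(sorted(result.items()))
```

In this toy input, only the k-mers overlapping the single-base difference pass the filter, while the shared `TTTT` k-mer is discarded. A real analysis would need replicate-aware statistics and multiple-testing correction rather than a bare fold-change cutoff.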
Surgery and cisplatin-based treatment of hepatoblastoma (HB) currently guarantee the survival of 70%-80% of patients. However, some important challenges remain in diagnosing high-risk tumors and identifying relevant targetable pathways offering new therapeutic avenues. Previously, two molecular subclasses of HB tumors have been described, C1 and C2, with C2 being the subgroup with the poorest prognosis, a more advanced tumor stage, and the worst overall survival rate.
Background: The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured.
Background: High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility that chRNAs expressed specifically in disease could be used as biomarkers for both diagnosis and prognosis.
Results: The task of discriminating true chRNAs from false ones poses an interesting Machine Learning (ML) challenge. First, the sequencing data may contain false reads due to technical artifacts, and during the analysis process, bioinformatics tools may generate false positives due to methodological biases.
The cytidine analogues azacytidine and 5-aza-2'-deoxycytidine (decitabine) are commonly used to treat myelodysplastic syndromes, with or without a myeloproliferative component. It remains unclear whether the response to these hypomethylating agents results from a cytotoxic or an epigenetic effect. In this study, we address this question in chronic myelomonocytic leukaemia.
Despite the improvement in treatment options, chronic lymphocytic leukemia (CLL) remains an incurable disease and patients show a heterogeneous clinical course requiring therapy for many of them. In the current work, we have built a 20-gene expression (GE)-based risk score predictive of patients' overall survival and improving risk classification using microarray gene expression data. The GE-based risk score identified a high-risk group associated with a significantly shorter overall survival (OS) and time to treatment (TTT) (P ≤ .
Recent sequencing technologies that allow massive parallel production of short reads are the method of choice for transcriptome analysis. Particularly, digital gene expression (DGE) technologies produce a large dynamic range of expression data by generating short tag signatures for each cell transcript. These tags can be mapped back to a reference genome to identify new transcribed regions that can be further covered by RNA-sequencing (RNA-Seq) reads.
A large number of RNA-sequencing studies set out to predict mutations, splice junctions or fusion RNAs. We propose a method, CRAC, that integrates genomic locations and local coverage to enable such predictions to be made directly from RNA-seq read analysis. A k-mer profiling approach detects candidate mutations, indels and splice or chimeric junctions in each single read.
The N-myc downstream regulated gene 1 (NDRG1) has been identified as a metastasis-suppressor gene in prostate cancer (PCa). Compounds targeting PCa cells deficient in NDRG1 could potentially decrease invasion/metastasis of PCa. A cell-based screening strategy was employed to identify small molecules that selectively target NDRG1-deficient PCa cells.
Patients with normal karyotype represent the single largest cytogenetic group of acute myeloid leukemia (AML), with highly heterogeneous clinical and molecular characteristics. In this study, we sought to determine new prognostic biomarkers in cytogenetically normal (CN)-AML patients. A gene expression (GE)-based risk score was built, summing up the prognostic value of 22 genes whose expression is associated with a bad prognosis in a training cohort of 163 patients.
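A score that sums the prognostic value of several genes can be sketched as below. The abstract does not state how each gene's contribution is weighted or thresholded, so the per-gene cutoffs, weights, and gene names here are purely hypothetical placeholders, and the rule "add the weight when expression exceeds the cutoff" is an assumption.

```python
def risk_score(expression, model):
    """Sum the weights of all genes whose expression exceeds their cutoff.

    expression: {gene: expression value} for one patient.
    model: {gene: (cutoff, weight)}; hypothetical values, not from the study.
    Genes missing from the patient's profile contribute nothing.
    """
    score = 0.0
    for gene, (cutoff, weight) in model.items():
        if expression.get(gene, 0.0) > cutoff:
            score += weight
    return score


# Hypothetical three-gene model and one patient profile.
model = {"GENE_A": (5.0, 1.0), "GENE_B": (3.0, 2.0), "GENE_C": (7.0, 0.5)}
patient = {"GENE_A": 6.2, "GENE_B": 2.1, "GENE_C": 9.0}
score = risk_score(patient, model)
print(score)
```

Patients would then be assigned to risk groups by comparing the score against a threshold chosen on the training cohort; that threshold is likewise not given in the text above.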
Background: High Throughput Sequencing (HTS) is now heavily exploited for genome (re-)sequencing, metagenomics, epigenomics, and transcriptomics and requires different, but computationally intensive, bioinformatic analyses. When a reference genome is available, mapping reads on it is the first step of this analysis. Read mapping programs owe their efficiency to the use of involved genome indexing data structures, like the Burrows-Wheeler transform.
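To make the role of the Burrows-Wheeler transform concrete, here is a small sketch that builds the transform by sorting rotations and then counts exact pattern occurrences with backward search. It is a teaching-sized version under simplifying assumptions: the naive construction and the linear-time rank computation would not scale to a genome, and the function names are illustrative rather than taken from any read mapper.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (naive construction).

    Assumes the sentinel does not occur in text and sorts before every other symbol.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str, pattern):
    """Count exact occurrences of pattern in the original text by backward search."""
    # first[c] = index of the first row starting with c in the sorted rotations.
    first = {}
    for i, c in enumerate(sorted(bwt_str)):
        first.setdefault(c, i)
    lo, hi = 0, len(bwt_str)  # current half-open range of matching rows
    for c in reversed(pattern):
        if c not in first:
            return 0
        # Rank queries done by slicing; real indexes precompute these tables.
        lo = first[c] + bwt_str[:lo].count(c)
        hi = first[c] + bwt_str[:hi].count(c)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
hits = count_occurrences(transformed, "ana")
print(transformed, hits)
```

The search touches the pattern one character at a time and never scans the text itself, which is why indexes of this family make mapping millions of short reads tractable.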
Ultra high-throughput sequencing is used to analyse the transcriptome or interactome at unprecedented depth on a genome-wide scale. These techniques yield short sequence reads that are then mapped on a genome sequence to predict putatively transcribed or protein-interacting regions. We argue that factors such as background distribution, sequence errors, and read length impact on the prediction capacity of sequence census experiments.
Reticulocytes are the last stage of erythropoiesis before red blood cells (RBC). Although most of the RBC proteins have been characterized, little is known about the expression profile of their mRNAs during differentiation. Our goal was to initiate a global genomic approach to provide a transcriptional database with which it will be possible to follow erythroid differentiation and study RBC disorders.