The current diagnostic rate for patients with suspected Mendelian genetic disorders is only 25 to 58%, even though whole exome sequencing (WES) is part of the standard of care. One reason for the low diagnostic rate is that traditional WES analysis methods struggle to detect RNA splicing aberrations. It is estimated that 15-50% of human pathogenic variants alter splicing, with numerous splice-altering variants being causal for known Mendelian disorders.
Alternative splicing (AS) of pre-mRNA plays a crucial role in tissue-specific gene regulation, with disease implications due to splicing defects. Predicting and manipulating AS can therefore uncover new regulatory mechanisms and aid in therapeutics design. We introduce TrASPr+BOS, a generative AI model with Bayesian Optimization for predicting and designing RNA for tissue-specific splicing outcomes.
Nucleic Acids Res
January 2025
RNA sequencing (RNA-seq) is widely adopted for transcriptome analysis but has inherent biases that hinder the comprehensive detection and quantification of alternative splicing. To address this, we present an efficient targeted RNA-seq method that greatly enriches for splicing-informative junction-spanning reads. Local splicing variation sequencing (LSV-seq) utilizes multiplexed reverse transcription from highly scalable pools of primers anchored near splicing events of interest.
RNA-sequencing (RNA-seq) is widely adopted for transcriptome analysis but has inherent biases which hinder the comprehensive detection and quantification of alternative splicing. To address this, we present an efficient targeted RNA-seq method that greatly enriches for splicing-informative junction-spanning reads. Local Splicing Variation sequencing (LSV-seq) utilizes multiplexed reverse transcription from highly scalable pools of primers anchored near splicing events of interest.
Mapping transcriptomic variations using either short- or long-read RNA sequencing is a staple of genomic research. Long reads are able to capture entire isoforms and overcome repetitive regions, whereas short reads still provide improved coverage and error rates. Yet, open questions remain, such as how to quantitatively compare the technologies, can we combine them, and what is the benefit of such a combined view? We tackle these questions by first creating a pipeline to assess matched long- and short-read data using a variety of transcriptome statistics.
Mapping transcriptomic variations using either short- or long-read RNA sequencing is a staple of genomic research. Long reads are able to capture entire isoforms and overcome repetitive regions, while short reads still provide improved coverage and error rates. Yet how to quantitatively compare the technologies, whether they can be combined, and what the benefit of such a combined view may be remain open questions.
Quantification of RNA splicing variations based on RNA-Sequencing can reveal tissue- and disease-specific splicing patterns. To study such splicing variations, we introduce MAJIQlopedia, an encyclopedia of splicing variations that encompasses 86 human tissues and 41 cancer datasets. MAJIQlopedia reports annotated and unannotated splicing events for a total of 486 175 alternative splice junctions in normal tissues and 338 317 alternative splice junctions in cancer.
The ubiquity of RNA-seq has led to many methods that use RNA-seq data to analyze variations in RNA splicing. However, available methods are not well suited for handling heterogeneous and large datasets. Such datasets scale to thousands of samples across dozens of experimental conditions, exhibit increased variability compared to biological replicates, and involve thousands of unannotated splice variants resulting in increased transcriptome complexity.
Identification of cancer sub-types is a pivotal step for developing personalized treatment. Specifically, sub-typing based on changes in RNA splicing has been motivated by several recent studies. We thus develop CHESSBOARD, an unsupervised algorithm tailored for RNA splicing data that captures "tiles" in the data, defined by a subset of unique splicing changes in a subset of patients.