Publications by authors named "Gloria Sheynkman"

Numerous factors regulate alternative splicing of human genes at a co-transcriptional level. However, how alternative splicing depends on the regulation of gene expression is poorly understood. We leveraged data from the Genotype-Tissue Expression (GTEx) project to show a significant association of gene expression and splicing for 6874 (4.

Article Synopsis
  • Biomedical research is increasingly integrating artificial intelligence (AI) and machine learning (ML) to tackle complex challenges, necessitating a focus on ethical and explainable AI (XAI) due to the complexities of deep learning methods.
  • The NIH's Bridge2AI program is working on creating new flagship datasets aimed at enhancing AI/ML applications in biomedicine while establishing best practices, tools, standards, and criteria for assessing the data's AI readiness, including legal and ethical considerations.
  • The article outlines foundational criteria developed by the NIH Bridge2AI Standards Working Group to ensure the scientific rigor and ethical use of AI in biomedical research, emphasizing the need for ongoing adaptation as the field evolves.

A major fraction of loci identified by genome-wide association studies (GWASs) mediate alternative splicing, but mechanistic interpretation is hindered by the technical limitations of short-read RNA sequencing (RNA-seq), which cannot directly link splicing events to full-length protein isoforms. Long-read RNA-seq represents a powerful tool to characterize transcript isoforms and, more recently, to infer protein isoform existence. Here, we present an approach that integrates information from GWASs, splicing quantitative trait loci (sQTLs), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the ultimate protein isoform products they encode.

Alternative splicing is a major contributor to transcriptomic complexity, but the extent to which transcript isoforms are translated into stable, functional protein isoforms is unclear. Furthermore, detection of relatively scarce isoform-specific peptides is challenging, with many protein isoforms remaining uncharted due to technical limitations. Recently, a family of advanced targeted MS strategies, termed internal standard parallel reaction monitoring (IS-PRM), has demonstrated multiplexed, sensitive detection of predefined peptides of interest.

Purpose of Review: The underlying molecular mechanisms that direct stem cell differentiation into fully functional, mature cells remain an area of ongoing investigation. Cell state is the product of the combinatorial effect of individual factors operating within a coordinated regulatory network. Here, we discuss the contribution of both gene regulatory and splicing regulatory networks in defining stem cell fate during differentiation and the critical role of protein isoforms in this process.

Article Synopsis
  • Studying protein isoforms is crucial for biomedical research, but current methods using bottom-up mass spectrometry often face challenges like noisy detection and shared peptides, making it hard to analyze individual isoforms.
  • A new statistical method is introduced to enhance protein isoform analysis by combining mass spectrometry and transcriptomics data in a Bayesian framework, addressing uncertainties in peptide detection and abundance allocation.
  • The method shows strong performance in simulations and real datasets, accurately inferring protein isoform presence, estimating their abundance, and detecting differences between protein and transcript levels; it is available as a free Bioconductor R package with usage examples.
Article Synopsis
  • They generated over 427 million long-read sequences and found that longer, more accurate sequences yield better transcript detection, while increased read depth enhances quantification.
  • The study suggests that using reference-based tools works best for well-annotated genomes and recommends incorporating extra data to better identify rare transcripts, providing a benchmark for improving transcriptome analysis techniques in the future.

Most human transcription factor (TF) genes encode multiple protein isoforms differing in DNA binding domains, effector domains, or other protein regions. The global extent to which this results in functional differences between isoforms remains unknown. Here, we systematically compared 693 isoforms of 246 TF genes, assessing DNA binding, protein binding, transcriptional activation, subcellular localization, and condensate formation.

RNA isoforms influence cell identity and function. However, a comprehensive brain isoform map was lacking. We analyze single-cell RNA isoforms across brain regions, cell subtypes, developmental time points and species.

Long-read RNA sequencing has shed light on transcriptomic complexity, but questions remain about the functionality of downstream protein products. We introduce Biosurfer, a computational approach for comparing protein isoforms while systematically tracking the transcriptional, splicing, and translational variations that underlie differences in the sequences of the protein products. Using Biosurfer, we analyzed the differences in 32,799 pairs of GENCODE-annotated protein isoforms, finding that a majority (70%) of variable N-termini are due to alternative transcription start sites, while only 9% arise from 5' UTR alternative splicing.

Immunotherapy has emerged as a crucial strategy to combat cancer by "reprogramming" a patient's own immune system. Although immunotherapy is typically reserved for patients with a high mutational burden, neoantigens produced from posttranscriptional regulation may provide an untapped reservoir of common immunogenic targets for new targeted therapies. To comprehensively define tumor-specific and likely immunogenic neoantigens from patient RNA-Seq, we developed Splicing Neo Antigen Finder (SNAF), an easy-to-use and open-source computational workflow to predict splicing-derived immunogenic MHC-bound peptides (T cell antigen) and unannotated transmembrane proteins with altered extracellular epitopes (B cell antigen).

Article Synopsis
  • The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium aimed to evaluate long-read sequencing for analyzing transcripts by generating over 427 million sequences from various species.
  • The findings highlighted that longer, accurate sequences yield better transcript identification, while increased read depth enhances quantification accuracy, particularly in well-annotated genomes.
  • The study serves as a benchmark for transcriptome analysis strategies and suggests using additional data for detecting rare transcripts or employing reference-free methods.

The mechanisms by which connexin37 (Cx37) regulates cell cycle modulators and, consequently, growth arrest are poorly understood. We previously showed that arterial shear stress up-regulates Cx37 in endothelial cells and activates a Notch/Cx37/p27 signaling axis to promote G1 cell cycle arrest, and that this is required to enable arterial gene expression. However, how induced expression of a gap junction protein, Cx37, up-regulates the cyclin-dependent kinase inhibitor p27 to enable endothelial growth suppression and arterial specification is unclear.

Traditionally, disease-causal mutations were thought to disrupt gene function. However, it is becoming clearer that many deleterious mutations can exhibit "gain-of-function" (GOF) behavior. Systematic investigation of such mutations has been lacking and largely overlooked.

Long-read RNA sequencing (lrRNA-seq) produces detailed information about full-length transcripts, including novel and sample-specific isoforms. Furthermore, there is an opportunity to call variants directly from lrRNA-seq data. However, most state-of-the-art variant callers have been developed for genomic DNA.

RNA isoforms influence cell identity and function. Until recently, technological limitations prevented a genome-wide appraisal of isoform influence on cell identity in various parts of the brain. Using enhanced long-read single-cell isoform sequencing, we comprehensively analyze RNA isoforms in multiple mouse brain regions, cell subtypes, and developmental timepoints from postnatal day 14 (P14) to adult (P56).

A major fraction of loci identified by genome-wide association studies (GWASs) lead to alterations in alternative splicing, but interpretation of how such alterations impact proteins is hindered by the technical limitations of short-read RNA-seq, which cannot directly link splicing events to full-length transcript or protein isoforms. Long-read RNA-seq represents a powerful tool to define and quantify transcript isoforms and, more recently, to infer protein isoform existence. Here, we present a novel approach that integrates information from GWAS, splicing QTL (sQTL), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the ultimate protein isoform products they encode.

Alzheimer's disease (AD) is one of the most challenging neurodegenerative diseases because of its complicated and progressive mechanisms, and multiple risk factors. Increasing research evidence demonstrates that genetics may be a key factor responsible for the occurrence of the disease. Although previous reports identified quite a few AD-associated genes, they were mostly limited owing to patient sample size and selection bias.

Background: Coronary artery disease (CAD) is the leading cause of death worldwide. Recent meta-analyses of genome-wide association studies have identified over 175 loci associated with CAD. The majority of these loci are in noncoding regions and are predicted to regulate gene expression.

Article Synopsis
  • Endothelial cells play a crucial role in the cardiovascular system, and their properties can be influenced by RNA splicing, resulting in different protein isoforms.
  • A long-read proteogenomics approach was used to analyze human umbilical vein endothelial cells (HUVECs), revealing 53,863 transcript isoforms from 10,426 genes, with many being novel.
  • The study identified a significant number of novel protein isoforms formed from various RNA splicing mechanisms, suggesting important implications for understanding endothelial cell functions and signaling pathways.
Article Synopsis
  • Endothelial cells differentiate into arterial or venous types during blood vessel development, crucial for nutrient and waste transport in tissues.
  • The study uses specific mouse models to reveal that venous endothelial cells are primarily in an early G1 state with BMP signaling, while arterial cells are in a late G1 state with TGF-β signaling.
  • They found that these cell cycle stages are critical for the expression of venous and arterial genes, and that preventing cell cycle progression can fix defects in arterial-venous specification.

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages.

Article Synopsis
  • Current methods using short-read RNA sequencing struggle to map full-length transcripts accurately, making it hard to connect genetic variations to specific protein changes associated with diseases.
  • Long-read sequencing technologies are emerging as a solution, enabling researchers to better identify disease-related transcripts and their protein isoforms, thus enhancing our understanding of complex diseases and opening new avenues for potential treatments.