Background: Gene expression is a central process in all living organisms. Central questions in the field concern the way in which the expression levels of genes are encoded in the transcripts and affect their evolution, and whether expression levels can be predicted from transcript features alone. In this study we analyze S. cerevisiae, a model organism with the most abundant relevant cellular and genomic measurements, to evaluate the accuracy with which expression levels can be predicted from different parts of the transcript. To this end, we perform several types of regression analysis based on a total of 5323 transcript features. The main advantage of the proposed predictors over previous ones lies in the accurate and comprehensive definitions of the relevant transcript features, which are based on biophysical knowledge of the gene transcription and translation processes, their modeling, and their evolution.

Results: Cross-validation analyses demonstrate that our predictors achieve correlations of 0.68/0.68/0.70/0.61/0.81 with mRNA levels, ribosomal density, protein levels, proteins per mRNA molecule (PPR), and ribosomal load (RL), respectively (all p-values < 10^-140). When we consider predictors based exclusively on features of individual parts of the transcript (5'UTR, ORF, 3'UTR), the correlations with protein levels are 0.27/0.71/0.25, respectively (all p-values < 10^-5). This suggests that the information in the UTRs is redundant, and that features of the ORF alone yield predictions similar to those obtained from the entire transcript.

Conclusions: The reported results demonstrate that, in the analyzed model organism, the expression levels of a gene are encoded in the transcript. Specifically, it is feasible in S. cerevisiae to predict a large fraction of the variance of the different gene expression steps from transcript features alone. We report dozens of novel transcript features related to the prediction of expression levels, demonstrating how such analyses can aid in understanding the gene expression process and its evolution, and how such predictors can be designed for other organisms in the future.
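
The cross-validated correlations reported in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say which regression models were used, so ridge regression is an assumption here, and the function name, fold count, penalty, and synthetic data are all hypothetical. The sketch only shows the general idea of fitting on training folds and correlating held-out predictions with the measured values.

```python
import numpy as np

def cross_validated_correlation(X, y, n_folds=5, ridge=1.0, seed=0):
    """Pearson correlation between held-out predictions and targets.

    Assumption: ridge regression stands in for the unspecified
    regression models mentioned in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    order = rng.permutation(n_samples)
    folds = np.array_split(order, n_folds)
    predictions = np.empty(n_samples)

    for test_idx in folds:
        train_idx = np.setdiff1d(order, test_idx)

        # Standardize features using training-fold statistics only,
        # so no information leaks from the held-out fold.
        mean = X[train_idx].mean(axis=0)
        std = X[train_idx].std(axis=0)
        std[std == 0] = 1.0
        X_train = (X[train_idx] - mean) / std
        X_test = (X[test_idx] - mean) / std

        # Closed-form ridge solution on the centered target.
        y_mean = y[train_idx].mean()
        gram = X_train.T @ X_train + ridge * np.eye(n_features)
        weights = np.linalg.solve(gram, X_train.T @ (y[train_idx] - y_mean))

        predictions[test_idx] = X_test @ weights + y_mean

    return float(np.corrcoef(predictions, y)[0, 1])

# Synthetic stand-in for a (genes x transcript features) matrix and an
# expression measurement; real data would replace both arrays.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 10))
true_weights = np.linspace(-1.0, 1.0, 10)
y = X @ true_weights + data_rng.normal(scale=0.5, size=200)

r = cross_validated_correlation(X, y)
print(f"cross-validated correlation: {r:.2f}")
```

Restricting the columns of `X` to the features of one transcript region (for example, only ORF features) and rerunning the function would give the kind of per-region comparison described in the Results.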

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852043 (PMC)
http://dx.doi.org/10.1186/1471-2105-14-S15-S1 (DOI)
