Inferring protein from transcript abundances using convolutional neural networks.

BioData Min

Institute of Network Biology (INET), Molecular Targets and Therapies Center (MTTC), Helmholtz Munich, Neuherberg, Germany.

Published: February 2025

Background: Although transcript abundance is often used as a proxy for protein abundance, it is an unreliable predictor. As proteins execute biological functions and their expression levels influence phenotypic outcomes, we developed a convolutional neural network (CNN) to predict protein abundances from mRNA abundances, protein sequence, and mRNA sequence in Homo sapiens (H. sapiens) and the reference plant Arabidopsis thaliana (A. thaliana).
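
As an illustration of how a convolutional filter can read sequence input, the sketch below one-hot encodes an mRNA sequence and slides a single filter along it. It is a minimal NumPy example under stated assumptions, not the authors' architecture: the names `one_hot` and `conv1d_scan`, the "ACGU" alphabet order, and the "AUG" filter are all hypothetical choices made for illustration.

```python
import numpy as np

ALPHABET = "ACGU"  # assumed base order; the paper's encoding is not given in the abstract


def one_hot(seq):
    """Encode an RNA sequence as a (length, 4) matrix of 0/1 values."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seq), len(ALPHABET)))
    for pos, base in enumerate(seq):
        encoded[pos, index[base]] = 1.0
    return encoded


def conv1d_scan(encoded, kernel):
    """Slide one (width, 4) filter along the sequence without padding."""
    width = kernel.shape[0]
    n_positions = encoded.shape[0] - width + 1
    return np.array(
        [np.sum(encoded[i:i + width] * kernel) for i in range(n_positions)]
    )


# Hypothetical filter that responds most strongly to the motif "AUG".
kernel = one_hot("AUG")
scores = conv1d_scan(one_hot("GCAUGGC"), kernel)
best_position = int(np.argmax(scores))  # start index of the best-matching window
```

In a trained CNN the filter weights are learned rather than fixed, which is why inspecting them can point to sequence motifs.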

Results: After hyperparameter optimization and initial data exploration, we implemented distinct training modules for value-based and sequence-based data. By analyzing the learned weights, we revealed common and organism-specific sequence features that influence protein-to-mRNA ratios (PTRs), including known and putative sequence motifs. Adding condition-specific protein interaction information identified genes correlated with many PTRs but did not improve predictions, likely due to insufficient data. The integrated model predicted protein abundance on unseen genes with a coefficient of determination (r²) of 0.30 in H. sapiens and 0.32 in A. thaliana.
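
The two quantities named in the results can be sketched as follows. This is a minimal example with made-up toy values: the function names `log_ptr` and `r_squared` are hypothetical, the log10 scale is an assumption, and the coefficient of determination is computed here as 1 - SS_res/SS_tot, which may differ from the exact definition used in the paper (for example, a squared Pearson correlation).

```python
import numpy as np


def log_ptr(protein, mrna):
    """Protein-to-mRNA ratio on a log10 scale (the log scale is an assumption)."""
    protein = np.asarray(protein, dtype=float)
    mrna = np.asarray(mrna, dtype=float)
    return np.log10(protein) - np.log10(mrna)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy abundances for three genes (illustrative values, not data from the study).
ptr = log_ptr([1000.0, 100.0, 10.0], [10.0, 10.0, 10.0])

perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])    # predictions match exactly
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])  # always predicting the mean
```

Under this definition a value of 0.30 means the predictions account for about 30% of the variance in the held-out abundances.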

Conclusions: For H. sapiens, our model improves prediction performance by nearly 50% compared to previous sequence-based approaches, and for A. thaliana it represents the first model of its kind. The model's learned motifs recapitulate known regulatory elements, supporting its utility in systems-level and hypothesis-driven research approaches related to protein regulation.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11866710
DOI: http://dx.doi.org/10.1186/s13040-025-00434-z

Similar Publications

Head motion is a major source of image artifacts in head computed tomography (CT), degrading the image quality and impacting diagnosis. Image-domain-based motion correction is practical for routine use since it doesn't rely on hard-to-obtain CT projection data. However, existing convolutional neural network (CNN)-based methods tend to over-smooth images, particularly in cases of moderate to severe 3D motion artifacts.

Objective: This paper aims to address the need for real-time malaria disease detection that integrates a faster prediction model with a robust underlying network. The study first proposes a 5G network-based healthcare system and then develops an automated malaria detection model capable of providing an accurate diagnosis, particularly in areas with limited diagnostic resources.

Methods: The proposed system leverages a deep learning-based YOLOv5x algorithm to detect malaria parasites in thick and thin blood smear samples.

MONSTROUS: a web-based chemical-transporter interaction profiler.

Front Pharmacol

February 2025

Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, Defense Health Agency Research and Development, Medical Research and Development Command, Frederick, MD, United States.

Transporters are membrane proteins that are critical for normal cellular function and mediate the transport of endogenous and exogenous chemicals. Chemical interactions with these transporters have the potential to affect the pharmacokinetic properties of drugs. Inhibition of transporters can cause adverse drug-drug interactions and toxicity, whereas if a drug is a substrate of a transporter, it could lead to reduced therapeutic effects.

Integration of Hyperspectral Imaging and Deep Learning for Discrimination of Fumigated Lilies and Prediction of Quality Indicator Contents.

Foods

February 2025

Jiangxi Province Key Laboratory of Sustainable Utilization of Traditional Chinese Medicine Resources, Institute of Traditional Chinese Medicine Health Industry, China Academy of Chinese Medical Sciences, Nanchang 330115, China.

The lily, valued for its edibility and medicinal properties, is rich in essential nutrients. However, storage conditions and sulfur fumigation during processing can degrade key nutrients like polysaccharides, phenols, and sulfur dioxide. To address this, we applied a deep learning model combined with hyperspectral imaging for the rapid prediction of nutrient quality.

Mediation CNN (Med-CNN) Model for High-Dimensional Mediation Data.

Int J Mol Sci

February 2025

Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5S 1A1, Canada.

Complex biological features such as the human microbiome and gene expression play a crucial role in human health by mediating various biomedical processes that influence disease progression, such as immune responses and metabolic processes. Understanding these mediation roles is essential for gaining insights into disease pathogenesis and improving treatment outcomes. However, analyzing such high-dimensional mediation features presents challenges due to their inherent structure and correlations, such as the hierarchical taxonomic structures in microbial operational taxonomic units (OTUs), gene-pathway relationships, and the high dimensionality of the datasets, which complicates mediation analysis.
