Predicting protein phosphorylation sites in soybean using interpretable deep tabular learning network.

Brief Bioinform

Department of Electrical and Electronic Engineering, Faculty of Engineering and Physical Sciences, Centre for Vision, Speech, and Signal Processing, University of Surrey, Guildford, UK.

Published: March 2022

Phosphorylation of proteins is one of the most significant post-translational modifications (PTMs) and plays a crucial role in plant functionality through its impact on signaling, gene expression, enzyme kinetics, protein stability and interactions. Accurate prediction of plant phosphorylation sites (p-sites) is vital because abnormal regulation of phosphorylation usually leads to plant diseases. However, current experimental methods for PTM prediction suffer from high computational cost and are error-prone. The present study develops machine learning-based prediction techniques, including a high-performance interpretable deep tabular learning network (TabNet), to improve the prediction of protein p-sites in soybean. Moreover, we use a hybrid feature set of sequence-based features, physicochemical properties and position-specific scoring matrices to predict serine (Ser/S), threonine (Thr/T) and tyrosine (Tyr/Y) p-sites in soybean for the first time. The experimentally verified p-site data for soybean proteins are collected from the eukaryotic phosphorylation sites database and the database of post-translational modifications. We then remove redundant positive and negative samples by dropping protein sequences with >40% similarity. The developed techniques all achieve accuracies above 70%. The TabNet model is the best-performing classifier when using hybrid features with a window size of 13, yielding 78.96% sensitivity and 77.24% specificity. These results indicate that the TabNet method offers both high performance and interpretability. The proposed technique can analyze the data automatically, without measurement errors or human intervention. Furthermore, it can be used to predict putative protein p-sites in plants effectively. The collected dataset and source code are publicly deposited at https://github.com/Elham-khalili/Soybean-P-sites-Prediction.
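To make the window size and the two reported metrics concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a window of 13 means the candidate S/T/Y residue plus six flanking residues on each side, and that windows running past the sequence ends are padded with a placeholder symbol ("X"); both points are assumptions, and the function names are hypothetical. The authors' repository may handle these details differently.

```python
from typing import List, Sequence, Tuple

PHOSPHO_RESIDUES = {"S", "T", "Y"}  # serine, threonine, tyrosine
PAD = "X"  # assumed padding symbol for windows that overrun the sequence ends


def extract_windows(sequence: str, window_size: int = 13) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue, window) for every S/T/Y in the sequence.

    Each window is centred on the candidate residue (hypothetical helper).
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the site sits at the centre")
    half = window_size // 2
    padded = PAD * half + sequence + PAD * half
    windows = []
    for i, residue in enumerate(sequence):
        if residue in PHOSPHO_RESIDUES:
            # In the padded string, the residue at index i sits at index i + half,
            # so the slice [i, i + window_size) is centred on it.
            windows.append((i + 1, residue, padded[i:i + window_size]))
    return windows


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP). Labels are 0 or 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy sequence, not real soybean data.
    for position, residue, window in extract_windows("MASTKLYPQ", window_size=13):
        print(position, residue, window)
    print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a pipeline like the one described, each extracted window would then be encoded into the hybrid feature vector (sequence-based features, physicochemical properties and position-specific scoring matrix values) before being passed to a classifier such as TabNet; that encoding step is not sketched here.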

Source
http://dx.doi.org/10.1093/bib/bbac015
