deep-Sep: a deep learning-based method for fast and accurate prediction of selenoprotein genes in bacteria.

mSystems

Shenzhen Key Laboratory of Marine Bioresources and Ecology, Brain Disease and Big Data Research Institute, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, Guangdong, China.

Published: March 2025

Selenoproteins are a special group of proteins with major roles in cellular antioxidant defense. They contain the 21st amino acid, selenocysteine (Sec), in their active sites; Sec is encoded by an in-frame UGA codon. Compared with eukaryotes, identifying selenoprotein genes in bacteria remains challenging because there is no effective strategy for distinguishing a Sec-encoding UGA codon from a normal stop signal. In this study, we developed a deep learning-based algorithm, deep-Sep, for quickly and precisely identifying selenoprotein genes in bacterial genomic sequences. The algorithm uses a Transformer-based neural network architecture to construct an optimal model for detecting Sec-encoding UGA codons, together with a homology search-based strategy to remove additional false positives. During the training and testing stages, deep-Sep performed well, with an F1 score of 0.939 and an area under the receiver operating characteristic curve of 0.987. Furthermore, when applied to 20 bacterial genomes as independent test data sets, deep-Sep identified both known and new selenoprotein genes and significantly outperformed the existing state-of-the-art method. The algorithm is a powerful tool for comprehensively characterizing selenoprotein genes in bacterial genomes. It should not only assist in the accurate annotation of selenoprotein genes in genome sequencing projects but also provide new insights into the roles of selenium in bacteria.

IMPORTANCE: Selenium is an essential micronutrient present in selenoproteins in the form of Sec, a rare amino acid encoded by the opal stop codon UGA. Identifying all selenoproteins is vital for investigating the functions of selenium in nature. Previous strategies for predicting selenoprotein genes relied mainly on identifying a special cis-acting Sec insertion sequence (SECIS) element within mRNAs. However, because SECIS elements are complex and variable, recognizing all selenoprotein genes in bacteria is still a major challenge in the annotation of bacterial genomes. We have developed a deep learning-based algorithm to predict selenoprotein genes in bacterial genomic sequences, which demonstrates superior performance compared with currently available methods. The algorithm can be used in either web-based or local (standalone) mode and is a promising tool for identifying the complete set of selenoprotein genes in bacteria.
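
To make the core problem concrete, the sketch below enumerates in-frame UGA (TGA in DNA) codons that lie downstream of a start codon and extracts the flanking nucleotides around each one. This is only an illustration of how candidate sites might be collected before a classifier decides which are Sec-encoding; it is not the deep-Sep implementation. The names `Candidate` and `candidate_uga_sites`, the `flank` size, and the choice of start codons are assumptions made for this example, and only the forward strand is scanned.

```python
from typing import Iterator, NamedTuple

# Assumed start codons commonly used in bacteria; TAA/TAG are treated as
# unambiguous stops, while TGA is the ambiguous codon of interest.
START_CODONS = {"ATG", "GTG", "TTG"}
HARD_STOPS = {"TAA", "TAG"}


class Candidate(NamedTuple):
    frame: int     # reading frame (0, 1 or 2) on the forward strand
    position: int  # 0-based index of the first base of the TGA codon
    window: str    # the codon plus up to `flank` bases on each side


def candidate_uga_sites(seq: str, flank: int = 30) -> Iterator[Candidate]:
    """Yield in-frame TGA codons found after a start codon (hypothetical helper).

    Every TGA inside an open frame is reported as a candidate; a downstream
    model would still have to decide whether it encodes Sec or is a stop.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    for frame in range(3):
        in_orf = False
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                in_orf = codon in START_CODONS
            elif codon in HARD_STOPS:
                in_orf = False
            elif codon == "TGA":
                lo = max(0, i - flank)
                yield Candidate(frame, i, seq[lo:i + 3 + flank])


if __name__ == "__main__":
    for site in candidate_uga_sites("ATGAAATGACCCTAA", flank=3):
        print(site)
```

In a fuller pipeline, the reverse complement would be scanned as well, and each extracted window would be passed to a trained model, with a homology search applied afterwards to filter remaining false positives, as the abstract describes.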

Source
http://dx.doi.org/10.1128/msystems.01258-24
