Analyzing microbial samples remains computationally challenging because of their diversity and complexity. The lack of robust de novo protein function prediction methods makes it harder still to derive functional insights from these samples. Traditional prediction methods, which depend on homology and sequence similarity, often fail to predict functions for novel proteins and for proteins without known homologs. Moreover, most of these methods have been trained on largely eukaryotic data and have not been evaluated on or applied to microbial datasets. This research introduces DeepGOMeta, a deep learning model that predicts protein functions as Gene Ontology (GO) terms and is trained on a dataset relevant to microbes. The model is applied to diverse microbial datasets to demonstrate its use for gaining biological insights. Data and code are available at https://github.com/bio-ontology-research-group/deepgometa.

Source
DOI: http://dx.doi.org/10.1038/s41598-024-82956-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11685674

Similar Publications

Cigarette smoking is a well-known risk factor inducing the development and progression of various diseases. Nicotine (NIC) is the major constituent of cigarette smoke. However, knowledge of the mechanism underlying the NIC-regulated stem cell functions is limited.

RAS guanyl-releasing protein 1 (RASGRP1) deficiency is characterized by immune dysregulation and Epstein-Barr virus (EBV)-related lymphoproliferation. Diffuse mesangial sclerosis is one of the infrequent causes of infantile nephrotic syndrome. Here, we describe a 7-year-old girl who was diagnosed with diffuse mesangial sclerosis at 5 months old and subsequently developed chronic bilateral neck swelling at the age of 3 years.

Anaplastic Thyroid Cancer (ATC) is an aggressive form of cancer with poor prognosis, heavily influenced by its tumor immune microenvironment (TIME). Understanding the cellular and gene expression dynamics within the TIME is crucial for developing targeted therapies. This study analyzes the immune microenvironment of ATC and Papillary Thyroid Cancer (PTC) using single-cell RNA sequencing (scRNA-seq).

The major limiting factor of photosynthesis in C3 plants is the enzyme Rubisco, which inadequately distinguishes between carbon dioxide and oxygen. To overcome the catalytic deficiencies of Rubisco, cyanobacteria use advanced protein microcompartments called carboxysomes, which envelop the enzymes Rubisco and carbonic anhydrase (CA). These microcompartments facilitate the diffusion of bicarbonate ions, which CA converts to CO₂, resulting in an increase in carbon flux near Rubisco that boosts the CO₂ fixation process.

Alzheimer's disease (AD) is a central nervous system degenerative disease with a stealthy onset and a progressive course characterized by memory loss, cognitive dysfunction, and abnormal psychological and behavioral symptoms. However, the pathogenesis of AD remains elusive. An increasing number of studies have shown that oligodendrocyte progenitor cells (OPCs) and oligodendroglial lineage cells (OLGs), especially OPCs and mature oligodendrocytes (OLGs), which are derived from OPCs, play important roles in the pathogenesis of AD.
