Protein annotation has long been a challenging task in computational biology. Gene Ontology (GO) has become one of the most popular frameworks for describing protein functions and their relationships. Predicting a protein's annotation with appropriate GO terms demands high-quality GO term representation learning, which aims to learn, for each functional label, a low-dimensional dense vector with accompanying semantic meaning, also known as an embedding. However, existing GO term embedding methods, which mainly take into account ancestral co-occurrence information, have yet to capture the full topological information in the GO directed acyclic graph (DAG). In this study, we propose a novel GO term representation learning method, PO2Vec, which utilizes partial order relationships to improve the GO term representations. Extensive evaluations show that PO2Vec achieves better outcomes than existing embedding methods in a variety of downstream biological tasks. Based on PO2Vec, we further developed a new protein function prediction method, PO2GO, which demonstrates superior benchmark performance in multiple metrics, annotation specificity, and few-shot prediction capability. These results suggest that a high-quality representation of the GO structure is critical for diverse biological tasks, including computational protein annotation.
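To make the notion of a partial order on a DAG concrete, here is a minimal sketch in Python. It is not the PO2Vec method. It only illustrates the kind of ancestor/descendant structure the abstract refers to. The term labels (`root`, `A`, `B`, `C`, `D`) and the helper names are hypothetical and do not correspond to real GO identifiers.

```python
from functools import lru_cache

# Toy DAG: each term maps to its direct parents.
# Labels are hypothetical placeholders, not real GO terms.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "C": {"A"},
    "D": {"A", "B"},  # a term may have several parents in a DAG
}


@lru_cache(maxsize=None)
def ancestors(term):
    """Return all proper ancestors of `term` (transitive closure of the parent relation)."""
    found = set()
    for parent in PARENTS[term]:
        found.add(parent)
        found |= ancestors(parent)
    return frozenset(found)


def is_descendant_or_equal(a, b):
    """Partial order a <= b: `a` is `b` itself or a descendant of `b`."""
    return a == b or b in ancestors(a)


def comparable(a, b):
    """Two terms are comparable if one lies on an ancestor path of the other."""
    return is_descendant_or_equal(a, b) or is_descendant_or_equal(b, a)
```

In this toy graph, `C` and `D` share the ancestor `A` but are not comparable to each other, whereas `D` and `root` are. A method that looks only at which ancestors co-occur would not, by itself, distinguish such ordered and unordered pairs.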
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10917077 | PMC |
http://dx.doi.org/10.1093/bib/bbae077 | DOI Listing |
Methods Mol Biol
December 2024
Horticultural Crops Disease and Pest Management Research Unit, United States Department of Agriculture-Agricultural Research Service, Corvallis, OR, USA.
Pathogens have evolved effector proteins to suppress host immunity and facilitate plant infections. RxLR effectors are small, secreted effector proteins with conserved RxLR and dEER amino acid motifs at the N terminus and highly variable C termini and are commonly found in oomycete species. We provide computational approaches to annotate RxLR candidate effector genes in a genome assembly in FASTA format with an available GFF file.
Toxins (Basel)
November 2024
Facultad de Ciencias Exactas y Naturales, Pontificia Universidad Católica del Ecuador, Quito 170525, Ecuador.
Previous proteomic studies of viperid venom revealed that it is mainly composed of metalloproteinases (SVMPs), serine proteinases (SVSPs), phospholipase A2 (PLA2), and C-type lectins (CTLs). However, other proteins appear in minor amounts that affect prey and need to be identified. This study aimed to identify novel toxic proteins in the venom gland transcriptome of and , using data from NCBI.
J Fungi (Basel)
December 2024
Sanya Nanfan Research Institute, Hainan University, Sanya 572025, China.
A pathogen strain responsible for sweet potato stem and foliage scab disease was isolated from sweet potato stems. Through a phylogenetic analysis based on the rDNA internal transcribed spacer (ITS) region, combined with morphological methods, the isolated strain was identified as To comprehensively analyze the pathogenicity of the isolated strain from a genetic perspective, the whole-genome sequencing of HD-1 was performed using both the PacBio and Illumina platforms. The genome of HD-1 is about 26.
J Fungi (Basel)
December 2024
College of Agronomy, Guangxi University, Nanning 530004, China.
Carbohydrate-binding modules (CBMs) are essential virulence factors in phytopathogens, particularly the extensively studied members from the CBM50 gene family, which are known as lysin motif (LysM) effectors and which play crucial roles in plant-pathogen interactions. However, the function of CBM50 in has yet to be fully studied. In this study, we identified seven CBM50 genes from the genome through complete sequence analysis and functional annotation.
J Fungi (Basel)
December 2024
Hubei Key Laboratory of Natural Medicinal Chemistry and Resource Evaluation, School of Pharmacy, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China.
is an edible and medicinal macrofungus with significant biological activity and broad pharmaceutical prospects that has received increasing attention in recent years. Although it is an important resource for macrofungi, knowledge of it remains limited. In this study, we sequenced, de novo assembled, and annotated the whole genome of using a PacBio Sequel II sequencer.