Semi-supervised protein subcellular localization.

BMC Bioinformatics

Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong.

Published: January 2009

Background: Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. Location information can indicate key functions of proteins. Accurate prediction of protein subcellular localization can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine (SVM) approaches, have already been widely used to predict protein subcellular localization. However, a major drawback of these approaches is that a large amount of data must be labeled before the prediction system can learn a classifier with good generalization ability. In real-world cases, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and to prepare labeled instances.

Results: In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high-quality prediction model. We first construct an initial classifier from a small set of labeled examples, and then use unlabeled instances to refine the classifier for future predictions.
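The two-step idea described above (train on a few labeled examples, then refine with unlabeled ones) can be illustrated with a generic self-training loop. This is only a minimal sketch under stated assumptions, not the algorithm used in the paper: the abstract does not specify the refinement procedure, and a simple nearest-centroid classifier stands in for an SVM to keep the example dependency-free. The function names (`centroids`, `predict`, `self_train`), the `min_margin` threshold, and the two-dimensional toy feature vectors are all hypothetical.

```python
import math

def centroids(X, y):
    """Compute the mean feature vector for each class label."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(cents, x):
    """Return (label, margin): the nearest centroid's label and the
    distance gap between the runner-up and the nearest centroid."""
    dists = sorted((math.dist(x, c), label) for label, c in cents.items())
    best_d, best_label = dists[0]
    margin = dists[1][0] - best_d if len(dists) > 1 else float("inf")
    return best_label, margin

def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=10):
    """Generic self-training (an assumed stand-in, not the paper's method):
    fit on labeled data, pseudo-label confident unlabeled points, refit."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(max_rounds):
        cents = centroids(X, y)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(cents, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:
            break  # nothing new to learn from; stop early
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = rest
    return centroids(X, y)

# Hypothetical toy data: one labeled protein per location, five unlabeled.
X_lab = [[0.0, 0.0], [4.0, 4.0]]
y_lab = ["cytoplasm", "nucleus"]
X_unlab = [[0.5, 0.2], [1.0, 1.0], [3.5, 3.8], [3.0, 3.0], [1.5, 1.8]]

model = self_train(X_lab, y_lab, X_unlab)
label_a, _ = predict(model, [1.0, 0.5])
label_b, _ = predict(model, [3.2, 3.9])
```

In this sketch, only unlabeled points whose margin clears `min_margin` are pseudo-labeled, so ambiguous points stay out of the training set. The threshold is a design choice that trades the amount of unlabeled data used against the risk of reinforcing early mistakes.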

Conclusion: Experimental results show that our method can effectively reduce the data-labeling workload by exploiting unlabeled data. Our method is shown to improve on the state-of-the-art prediction results of SVM classifiers by more than 10%.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2648770
DOI: http://dx.doi.org/10.1186/1471-2105-10-S1-S47


