A fast SCOP fold classification system using content-based E-Predict algorithm.

BMC Bioinformatics

Medical and Biological Digital Library Research Lab, Department of Computer Science, University of Missouri, Columbia, MO 65211, USA.

Published: July 2006

Background: Domain experts manually construct the Structural Classification of Proteins (SCOP) database to categorize and compare protein structures. Although SCOP classifications are believed to be more reliable than those produced by other methods, building them by hand is labor-intensive. To mimic the human classification process, we develop an automatic SCOP fold classification system that assigns known SCOP folds to newly discovered proteins and recognizes novel folds.

Results: With a sufficient amount of ground truth data, our system assigns known folds to newly discovered proteins in the latest SCOP v1.69 release with 92.17% accuracy. It also recognizes novel folds with 89.27% accuracy under 10-fold cross-validation. The average response times for classifying proteins with 500 and 1409 amino acids are 4.1 and 17.4 seconds, respectively. Compared with several structural alignment algorithms, our approach outperforms previous methods in both classification accuracy and efficiency.
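
The evaluation setup described in the Results (assign a known fold, flag a novel fold, score with 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the E-Predict algorithm, whose details the abstract does not give. A plain nearest-neighbor classifier stands in as a generic non-parametric method, and the feature vectors, the `novelty_threshold` parameter, and all function names are hypothetical.

```python
import math
import random


def nearest_neighbor(train, query):
    """Return (label, distance) of the training item closest to query."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)  # Euclidean distance
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


def classify(train, query, novelty_threshold):
    """Assign the nearest known fold, or 'novel' if nothing is close enough."""
    label, dist = nearest_neighbor(train, query)
    return label if dist <= novelty_threshold else "novel"


def k_fold_accuracy(data, k, novelty_threshold, seed=0):
    """Estimate accuracy by k-fold cross-validation over (features, label) pairs."""
    items = list(data)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]  # k disjoint held-out subsets
    correct = 0
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        for features, label in held_out:
            if classify(train, features, novelty_threshold) == label:
                correct += 1
    return correct / len(items)


# Toy data (assumed, not from the paper): two well-separated "folds".
data = [((i * 0.1, 0.0), "fold_a") for i in range(10)]
data += [((10.0 + i * 0.1, 10.0), "fold_b") for i in range(10)]

print(k_fold_accuracy(data, k=10, novelty_threshold=1.0))
print(classify(data, (50.0, 50.0), novelty_threshold=1.0))
```

In this sketch a query far from every labeled example is reported as "novel" rather than forced into a known fold, which mirrors the distinction the abstract draws between assigning known folds and recognizing new ones.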

Conclusion: In this paper, we build an advanced, non-parametric classifier to accelerate the manual SCOP classification process. Given satisfactory ground truth data from the SCOP database, our approach identifies relevant domain knowledge and yields reasonably accurate classifications. Our system is publicly accessible at http://ProteinDBS.rnet.missouri.edu/E-Predict.php.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1579235
DOI: http://dx.doi.org/10.1186/1471-2105-7-362
