Functional annotation of protein sequences with low similarity to well-characterized protein sequences is a major challenge of computational biology in the post-genomic era. The cyclin protein family is one such important family: its members share low sequence similarity, which makes discovering novel cyclins and establishing orthologous relationships among cyclins a difficult task. The currently identified cyclin motifs and cyclin-associated domains do not represent all of the identified and characterized cyclin sequences. We describe a Support Vector Machine (SVM) based classifier, CyclinPred, which can predict cyclin sequences with high efficiency. The SVM classifier was trained with features of selected cyclin and non-cyclin protein sequences. The training features include amino acid composition, dipeptide composition, secondary structure composition and PSI-BLAST generated Position Specific Scoring Matrix (PSSM) profiles. Results obtained from leave-one-out cross-validation (the jackknife test), self-consistency and holdout tests show that the SVM classifier trained with PSSM profile features was more accurate than the classifiers based on any of the other features alone or on hybrids of these features. A cyclin prediction server, CyclinPred, has been set up based on the SVM model trained with PSSM profiles. The CyclinPred prediction results indicate that the method may be used as a cyclin prediction tool, complementing conventional cyclin prediction methods.
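
The two simplest feature types named in the abstract, amino acid composition and dipeptide composition, can be illustrated with a short sketch. This is not the authors' code: the function names, the normalization to fractions, and the handling of non-standard residues are assumptions made for illustration, and the PSSM profile features that gave the best results are not shown. A small helper also illustrates how leave-one-out (jackknife) splits are formed.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard residue (assumed normalization)."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)  # non-standard symbols such as X are ignored
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent standard residues."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def leave_one_out_splits(n):
    """Yield (train_indices, test_index): each sample is held out exactly once."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


if __name__ == "__main__":
    toy = "ACCA"  # hypothetical toy sequence, not a real cyclin
    print(amino_acid_composition(toy)[:2])
    print(len(dipeptide_composition(toy)))
    print(list(leave_one_out_splits(3)))
```

Each sequence thus becomes a fixed-length vector (20 or 400 values) regardless of its length, which is what allows an SVM to be trained on proteins of different sizes.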


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2435623
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0002605

Publication Analysis

Top Keywords

protein sequences: 20
cyclin protein: 12
cyclin prediction: 12
cyclin: 11
sequences: 8
sequences low: 8
cyclin sequences: 8
svm classifier: 8
classifier trained: 8
trained features: 8

Similar Publications

Motivation: Predicting RNA-binding proteins (RBPs) is central to understanding post-transcriptional regulatory mechanisms. Here, we introduce EnrichRBP, an automated and interpretable computational platform specifically designed for the comprehensive analysis of RBP interactions with RNA.

Results: EnrichRBP is a web service that enables researchers to develop original deep learning and machine learning architectures to explore the complex dynamics of RNA-binding proteins.


An Efficient and Cost-Effective Novel Strategy for Identifying CRISPR-Cas-Mediated Mutants in Plant Offspring.

CRISPR J

January 2025

Plant Biotechnology Research Center, Fudan-SJTU-Nottingham Plant Biotechnology R&D Center, School of Agriculture and Biology, Shanghai Jiao Tong University, Minhang, Shanghai, China.

The clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 system has revolutionized targeted mutagenesis, but screening for mutations in large sample pools can be time-consuming and costly. We present an efficient and cost-effective polymerase chain reaction (PCR)-based strategy for identifying edited mutants in the T generation. Unlike previous methods, our approach addresses the challenges of large progeny populations by using T generation sequencing results for genotype prediction.


Optically pure monosubstituted [n]paracyclophanes are promising candidates for material synthesis, asymmetric catalysis, and drug discovery. Thus far, only a few catalytic asymmetric synthesis processes have been reported for assessing these strained atropisomers. In this study, we describe a highly enantioselective synthesis of monosubstituted [n]paracyclophanes by combining desymmetrization and kinetic resolution.


Proteins have proven to be useful agents in a variety of fields, from serving as potent therapeutics to enabling complex catalysis for chemical manufacture. However, they remain difficult to design and are instead typically selected for using extensive screens or directed evolution. Recent developments in protein large language models have enabled fast generation of diverse protein sequences in unexplored regions of protein space predicted to fold into varied structures, bind relevant targets, and catalyze novel reactions.


AlphaFold2 (AF2), a deep-learning based model that predicts protein structures from their amino acid sequences, has recently been used to predict multiple protein conformations. In some cases, AF2 has successfully predicted both dominant and alternative conformations of fold-switching proteins, which remodel their secondary and tertiary structures in response to cellular stimuli. Whether AF2 has learned enough protein folding principles to reliably predict alternative conformations outside of its training set is unclear.

