AI Article Synopsis

  • The study focuses on predicting domain and linker residues in protein sequences, crucial for understanding protein function and structure.
  • A consensus-based machine-learning approach built on six classifiers was developed; the resulting PDP-CON tool was tested on independent CASP targets, reaching an average accuracy of 0.86 and an average F-measure of 0.91.
  • The dataset, source code, and supplementary materials are available for noncommercial use at https://cmaterju.org/cmaterbioinfo/, facilitating further research in this area.

Article Abstract

The prediction of domain/linker residues in protein sequences is a crucial task in the functional classification of proteins, homology-based protein structure prediction, and high-throughput structural genomics. In this work, a novel consensus-based machine-learning technique was applied for residue-level prediction of the domain/linker annotations in protein sequences using ordered/disordered regions along protein chains and a set of physicochemical properties. Six different classifiers (decision tree, Gaussian naïve Bayes, linear discriminant analysis, support vector machine, random forest, and multilayer perceptron) were exhaustively explored for the residue-level prediction of domain/linker regions. The protein sequences from the curated CATH database were used for training and cross-validation experiments. Test results obtained by applying the developed PDP-CON tool to the mutually exclusive, independent proteins of the CASP-8, CASP-9, and CASP-10 databases are reported. An n-star quality consensus approach was used to combine the results yielded by different classifiers. The average PDP-CON accuracy and F-measure values for the CASP targets were found to be 0.86 and 0.91, respectively. The dataset, source code, and all supplementary materials for this work are available at https://cmaterju.org/cmaterbioinfo/ for noncommercial use.
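
The consensus step and the two reported metrics can be illustrated with a small sketch. It rests on assumptions that the abstract does not confirm: "n-star" is read here as "a residue is called a linker when at least n of the classifiers say so", the label 1 is arbitrarily chosen to mean linker, and the function names and toy votes are hypothetical, not taken from the PDP-CON source code or the CASP data.

```python
from typing import List, Sequence, Tuple


def n_star_consensus(votes: Sequence[Sequence[int]], n: int) -> List[int]:
    """Combine per-residue 0/1 votes from several classifiers.

    Assumed rule (not confirmed by the abstract): a residue gets label 1
    when at least n classifiers voted 1 for it.
    """
    if not votes:
        return []
    length = len(votes[0])
    if any(len(v) != length for v in votes):
        raise ValueError("all classifiers must label the same residues")
    if not 1 <= n <= len(votes):
        raise ValueError("n must be between 1 and the number of classifiers")
    # zip(*votes) yields, for each residue, the tuple of all classifier votes.
    return [1 if sum(column) >= n else 0 for column in zip(*votes)]


def accuracy_and_f_measure(
    truth: Sequence[int], pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (accuracy, F-measure) for binary residue labels."""
    if not truth or len(truth) != len(pred):
        raise ValueError("truth and pred must be non-empty and equally long")
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f_measure


# Toy example: six hypothetical classifiers voting on eight residues.
truth = [0, 0, 1, 1, 1, 0, 0, 0]
votes = [
    [0, 0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 1, 0, 0],
]

three_star = n_star_consensus(votes, 3)  # at least 3 of 6 must agree
five_star = n_star_consensus(votes, 5)   # stricter: at least 5 of 6
acc5, f5 = accuracy_and_f_measure(truth, five_star)
```

Under this reading, raising n makes the consensus stricter: in the toy data the 3-star labels match the truth exactly, while the 5-star labels miss one linker residue, which lowers recall and therefore the F-measure.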

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4788683
DOI: http://dx.doi.org/10.1007/s00894-016-2933-0

Publication Analysis

Top Keywords

prediction domain/linker: 16
protein sequences: 16
domain/linker residues: 8
residues protein: 8
consensus approach: 8
residue-level prediction: 8
regions protein: 8
protein: 6
pdp-con prediction: 4
domain/linker: 4
