Sequence representation and prediction of protein secondary structure for structural motifs in twilight zone proteins.

Protein J

Electrical and Computer Engineering Department, University of Alberta, Edmonton, Alberta, Canada, T6G 2V4.

Published: December 2006

Characterizing and classifying regularities in protein structure is an important element in uncovering the mechanisms that regulate protein structure, function and evolution. Recent research concentrates on the analysis of structural motifs that can be used to describe larger, fold-sized structures based on homologous primary sequences. At the same time, the accuracy of protein secondary structure prediction based on multiple sequence alignment drops significantly when low-homology (twilight zone) sequences are considered. To this end, this paper addresses the problem of providing an alternative sequence representation that improves the ability to distinguish secondary structures for twilight zone sequences without using alignment. We consider a novel classification problem in which structural motifs, referred to as structural fragments (SFs), are defined as uniform strand, helix and coil fragments. Classifying SFs allows us to design novel sequence representations and to investigate which other factors and prediction algorithms may improve discrimination. Comprehensive experimental results show that a statistically significant improvement in classification accuracy can be achieved by: (1) improving sequence representations, and (2) removing possible noise on the terminal residues of the SFs. Combining these two approaches reduces the error rate by 15% on average when compared to classification using a standard representation and noisy information on the terminal residues, bringing the classification accuracy to over 70%. Finally, we show that certain prediction algorithms, such as neural networks and boosted decision trees, are superior to other algorithms.
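As a rough illustration of the workflow the abstract outlines, the sketch below splits a sequence into uniform helix, strand and coil fragments, optionally trims terminal residues, and builds an alignment-free feature vector. The abstract does not specify the representations actually used, so the 20-dimensional amino acid composition here is only a stand-in, and the function names (`extract_fragments`, `trim_terminals`, `composition_vector`), the H/E/C labeling and the example sequence are assumptions made for illustration.

```python
from itertools import groupby

# The 20 standard amino acids in a fixed order (assumed feature ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def extract_fragments(sequence, structure):
    """Split a sequence into uniform fragments.

    `structure` is a per-residue string of labels (assumed here to be
    H = helix, E = strand, C = coil). Each maximal run of one label
    becomes one (residues, label) pair.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    fragments = []
    start = 0
    for label, run in groupby(structure):
        length = len(list(run))
        fragments.append((sequence[start:start + length], label))
        start += length
    return fragments


def trim_terminals(fragment, n=1):
    """Drop n residues from each end of a fragment.

    Fragments too short to trim are returned unchanged (a hypothetical
    policy; the abstract does not say how short fragments are handled).
    """
    if len(fragment) <= 2 * n:
        return fragment
    return fragment[n:len(fragment) - n]


def composition_vector(fragment):
    """Return the amino acid frequency vector of a fragment (no alignment needed)."""
    total = len(fragment)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [fragment.count(aa) / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Made-up example sequence and labels, not data from the paper.
    seq = "MKVLAAGIVGDE"
    ss = "CCHHHHHEEECC"
    for residues, label in extract_fragments(seq, ss):
        features = composition_vector(trim_terminals(residues))
        print(label, residues, features)
```

The resulting vectors could be passed to any standard classifier; the abstract reports that neural networks and boosted decision trees performed best among the algorithms compared.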

Source
http://dx.doi.org/10.1007/s10930-006-9029-0

Publication Analysis

Top Keywords

structural motifs: 12
twilight zone: 12
protein structure: 12
zone sequences: 8
sequence representations: 8
prediction algorithms: 8
classification accuracy: 8
terminal residues: 8
classification: 5
sequence: 4

Similar Publications

The wall-associated kinase (WAK) gene family encodes functional cell wall-related proteins. These genes are widely present in plants and serve as plant cell membrane receptors that perceive changes in the external environment and activate signaling pathways to participate in plant growth, development, defense, and stress response. However, the WAK gene family and the encoded proteins in soybean (Glycine max (L.

The methylation-demethylation dynamics of RNA play major roles in different biological functions, including stress responses, in plants. m6A methylation in RNA is orchestrated by the coordinated function of methyltransferases (writers) and demethylases (erasers). A genome-wide analysis of genes involved in methylation and demethylation was performed in pigeon pea.

Polyketide synthases (PKSs) are multidomain enzymatic assembly lines that biosynthesize a wide selection of bioactive natural products from simple building blocks. In contrast to their cis-acyltransferase (AT) counterparts, trans-AT PKSs rely on stand-alone ATs to load extender units onto acyl carrier protein (ACP) domains embedded in the core PKS machinery. trans-AT PKS gene clusters also encode stand-alone acyl hydrolases (AHs), which are predicted to share the overall fold of ATs but function like type II thioesterases (TEs), hydrolyzing aberrant acyl chains from ACP domains to promote biosynthetic efficiency.

For decades, extensive surfactant libraries have been developed to meet the requirements of downstream applications. However, achieving functional diversity has traditionally demanded a vast array of chemical motifs and synthetic pathways. Herein, a new approach for surfactant design based on structural isomerism is utilised to access a wide spectrum of functionalities.

The quest for color-pure emitters for multicolor bioimaging as well as for ultrahigh-definition organic light-emitting diodes demands facile design concepts to avoid tedious synthetic or computational trial-and-error procedures. We have recently presented a simple recipe to construct color-pure blue emitters, which combines basic resonance structure and frontier molecular orbital treatments; this recipe applies to multiresonant-type emitters and allows the chemical space to be enlarged toward novel structural motifs. In the current work, we show that such fundamental considerations further apply to the structurally entirely different family of xanthene dyes.
