Secondary structure characterization based on amino acid composition and availability in proteins.

Joji M Otaki, Motosuke Tsutsumi, Tomonori Gotoh, Haruhiko Yamamoto

J Chem Inf Model

The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology, and Marine Science, University of the Ryukyus, Nishihara, Okinawa 903-0213, Japan.

Published: April 2010

The importance of thorough analyses of the secondary structures in proteins as basic structural units cannot be overemphasized. Although recent computational methods have achieved reasonably high accuracy in predicting secondary structures from amino acid sequences, the simple and fundamental empirical approach of characterizing the amino acid composition of secondary structures was pursued mainly in the 1970s, with a small number of analyzed structures. To extend this classical approach to a large number of analyzed structures, here we characterized the amino acid sequences of secondary structures (12 154 alpha-helix units, 4592 3(10)-helix units, 16 787 beta-strand units, and 30 811 "other" units), using the representative three-dimensional protein structure records (1641 protein chains) from the Protein Data Bank. We first examined the lengths and the amino acid compositions of secondary structures, including rank order differences and assignment relationships among amino acids. These compositional results were largely, but not entirely, consistent with the previous studies. In addition, we examined the frequency of the 400 possible amino acid doublets and 8000 possible triplets in secondary structures based on their relative counts, termed the availability. We identified not only some triplets that were specific to a certain secondary structure but also so-called zero-count triplets, which did not occur in a given secondary structure at all, even though they were probabilistically predicted to occur several times. Taken together, the present study reveals essential features of secondary structures and suggests potential applications in secondary structure prediction and the functional design of protein sequences.
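The zero-count idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definition of availability or of the probabilistic prediction, so everything here is an assumption: the expected count of a triplet is taken as the number of triplet positions multiplied by the product of the single-residue frequencies within the same structure class (a simple independence model), and the function names and the `min_expected` threshold are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def triplet_counts(segments):
    """Count overlapping triplets inside each segment (never across segments)."""
    counts = Counter()
    for seg in segments:
        for i in range(len(seg) - 2):
            counts[seg[i:i + 3]] += 1
    return counts


def expected_triplet_counts(segments):
    """Expected count of all 8000 triplets under an independence model.

    Assumption (not from the paper): expected = number of triplet positions
    times the product of the three single-residue frequencies.
    """
    residues = Counter("".join(segments))
    total = sum(residues.values())
    n_positions = sum(max(len(seg) - 2, 0) for seg in segments)
    if total == 0:
        return {"".join(t): 0.0 for t in product(AMINO_ACIDS, repeat=3)}
    freq = {a: residues[a] / total for a in AMINO_ACIDS}
    return {
        "".join(t): n_positions * freq[t[0]] * freq[t[1]] * freq[t[2]]
        for t in product(AMINO_ACIDS, repeat=3)
    }


def zero_count_triplets(segments, min_expected=1.0):
    """Triplets never observed although expected at least `min_expected` times."""
    observed = triplet_counts(segments)
    expected = expected_triplet_counts(segments)
    return sorted(
        t for t, e in expected.items()
        if observed[t] == 0 and e >= min_expected
    )


# Toy input: two made-up segments standing in for one structure class.
toy_segments = ["AAAA", "AAGA"]
print(zero_count_triplets(toy_segments, min_expected=0.3))
```

In practice the segments would be the sequences of all units of one secondary structure type (for example, all alpha-helix units), and the threshold would be chosen so that only triplets expected "several times" are reported. The ratio of observed to expected counts from the same two functions could serve as a rough stand-in for a relative-count measure, though it should not be read as the paper's own availability formula.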

DOI: http://dx.doi.org/10.1021/ci900452z

