Availability of short amino acid sequences in proteins.

Protein Sci

Department of Biological Sciences, Kanagawa University, 2946 Tsuchiya, Hiratsuka, Kanagawa 259-1293, Japan.

Published: March 2005

Protein databases are receiving much attention as an important information source for proteome research. Although used extensively for similarity searches, the databases themselves have not been fully characterized. In a systematic attempt to reveal database characteristics that could help explain how protein chains are constructed, frequency distributions of all possible combinatorial sets of three, four, and five amino acids ("triplets," "quartets," and "pentats"; collectively called constituent sequences) were examined in the nonredundant (nr) protein database, demonstrating a nonrandom bias in their "availability" at the population level. Nonexistent pentats were found, showing low availability in biological proteins relative to their expected probabilities of occurrence. Among them, six representative ones were successfully synthesized as peptides with reasonably high yields by a conventional Fmoc method, excluding the possibility that a putative physicochemical energy barrier to their formation is a direct cause of the low availability. They were also expressed as soluble fusion proteins in a conventional Escherichia coli BL21Star(DE3) system with reasonably high yield, again excluding a possible difficulty in their biological synthesis. Together, these results suggest that information on the three-dimensional structures and functions of proteins exists in the context of connections between short constituent sequences, and that proteins are composed of evolutionarily selected constituent sequences, which is reflected in their availability differences in the database. These results may have biological implications for protein structural studies.
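The kind of availability survey described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: constituent sequences are treated as contiguous, ordered k-mers; the expected count comes from a simple independence model (product of single-residue frequencies times the number of windows); and the function names `count_kmers`, `residue_frequencies`, `availability_table`, and `absent_kmers` are hypothetical. It is not the authors' procedure.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STANDARD = set(AMINO_ACIDS)


def count_kmers(sequences, k):
    """Count contiguous k-mers, skipping windows with non-standard letters."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if all(aa in STANDARD for aa in kmer):
                counts[kmer] += 1
    return counts


def residue_frequencies(sequences):
    """Relative frequency of each standard amino acid across all sequences."""
    counts = Counter(aa for seq in sequences for aa in seq if aa in STANDARD)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found in input")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def availability_table(sequences, k):
    """Map every possible k-mer to (observed count, expected count).

    Assumption: the expected count uses an independence model, which may
    differ from the expected probabilities used in the paper.
    """
    observed = count_kmers(sequences, k)
    freqs = residue_frequencies(sequences)
    n_windows = sum(observed.values())
    table = {}
    for combo in product(AMINO_ACIDS, repeat=k):
        kmer = "".join(combo)
        p = 1.0
        for aa in kmer:
            p *= freqs[aa]
        table[kmer] = (observed.get(kmer, 0), p * n_windows)
    return table


def absent_kmers(table, min_expected=1.0):
    """k-mers never observed although the model expects >= min_expected."""
    hits = [(kmer, exp) for kmer, (obs, exp) in table.items()
            if obs == 0 and exp >= min_expected]
    return sorted(hits, key=lambda item: -item[1])
```

Calling `availability_table(sequences, 5)` enumerates all 20^5 = 3,200,000 pentats, so for a database the size of nr a streaming or array-based count would likely be preferable to a dictionary; the sketch favors readability over scale.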

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2279279
DOI: http://dx.doi.org/10.1110/ps.041092605

Publication Analysis

Top Keywords

constituent sequences: 12
sequences proteins: 8
protein databases: 8
low availability: 8
reasonably high: 8
sequences: 5
proteins: 5
protein: 5
availability: 4
availability short: 4

Similar Publications

Preparation of pH-Responsive Tanshinone IIA-Loaded Calcium Alginate Nanoparticles and Their Anticancer Mechanisms.

Pharmaceutics

January 2025

State Key Laboratory for Macromolecule Drugs and Large-Scale Manufacturing, College of Pharmacy, Wenzhou Medical University, Wenzhou 325035, China.

Tanshinone IIA (Tan IIA) is a lipophilic active constituent derived from the rhizomes and roots of Salvia miltiorrhiza (Danshen), a common Chinese medicinal herb. However, clinical applications of Tan IIA are limited by its poor solubility in water. To overcome this limitation, we developed a calcium alginate hydrogel (CA) as a hydrophilic carrier for Tan IIA, which significantly improved its solubility.

Complex motor skills involve intricate sequences of movements that require precise temporal coordination across multiple body parts, posing challenges to mastery based on perceived error or reward. One approach that has been widely used is to decompose such skills into simpler, constituent movement elements during the learning process, thereby aligning the task complexity with the learners' capacity for accurate execution. Despite common belief and prevalent adoption, the effectiveness of this method remains elusive.

Impact of IL-32 Gene Polymorphisms on Tuberculosis Susceptibility in a Chinese Han population.

Microb Pathog

January 2025

Department of Clinical Laboratory Medicine, Shanghai Pulmonary Hospital, Tongji University School of Medicine, Shanghai 200433, China.

Objective: Interleukin (IL)-32, encoded by the IL-32 gene, is a crucial constituent of the autophagy pathway and is involved in the regulation of Mycobacterium tuberculosis (M.tb) infection, a major global health challenge. This study aimed to examine the potential association between IL-32 polymorphisms and susceptibility to tuberculosis (TB), highlighting the significance of genetic factors in TB risk.

Ethnopharmacological Relevance: Glycyrrhiza, a legume native to the Mediterranean region, has a long history of ethnomedicinal use in China. Due to its antiviral, antibacterial, anti-inflammatory, antioxidant, antitumor, anti-ulcer, and hepatoprotective properties, Glycyrrhiza is widely utilized in the treatment of gastrointestinal disorders.

Aim of the Review: This review covers the specific mechanisms of the main active constituents of Glycyrrhiza in the treatment of inflammatory bowel disease, precancerous lesions, and colorectal cancer at all stages of the colitis-associated colon cancer "Inflammation-Dysplasia-Cancer" sequence, as well as its pharmacokinetics, toxicology, formulation improvements, and application studies, to provide new insights and perspectives on Glycyrrhiza as a dietary supplement to treat and prevent colitis-associated colon cancer.

Integrated omics analyses elucidate acetaminophen biodegradation by Enterobacter sp. APAP_BS8.

J Environ Manage

January 2025

Department of Botany, Institute of Science, Banaras Hindu University, Varanasi, 221005, India.

Acetaminophen (APAP) is an extensively consumed over-the-counter and prescribed medication and a constituent of many active pharmaceutical compounds as well as personal care products. Its wide-scale prevalence in the environment due to inefficient treatment technologies has classified APAP as a contaminant of emerging concern. Thus, it is imperative to explore efficient and sustainable methods for remediation of contaminated environments.
