Identification and distribution of protein families in 120 completed genomes using Gene3D.

Proteins

Biomolecular Structure and Modelling Group, Department of Biochemistry, University College London, Gower Street, London.

Published: May 2005

Using a new protocol, PFscape, we undertake a systematic identification of protein families and domain architectures in 120 complete genomes. PFscape clusters sequences into protein families using a Markov clustering algorithm (Enright et al., Nucleic Acids Res 2002;30:1575-1584) followed by complete linkage clustering according to sequence identity. Within each protein family, domains are recognized using a library of hidden Markov models comprising CATH structural and Pfam functional domains. Domain architectures are then determined using DomainFinder (Pearl et al., Protein Sci 2002;11:233-244), and the protein family and domain architecture data are amalgamated in the Gene3D database (Buchan et al., Genome Res 2002;12:503-514). Using Gene3D, we have investigated protein sequence space, the extent of structural annotation, and the distribution of different domain architectures in completed genomes from all kingdoms of life. As in earlier studies by other researchers, the distribution of domain families shows power-law behavior such that the largest 2,000 domain families can be mapped to approximately 70% of nonsingleton genome sequences; the remaining sequences are assigned to much smaller families. While approximately 50% of domain annotations within a genome are assigned to 219 universal domain families, a much smaller proportion (<10%) of protein sequences are assigned to universal protein families. This supports the mosaic theory of evolution, whereby domain duplication followed by domain shuffling gives rise to novel domain architectures that can expand the protein functional repertoire of an organism. Functional data (e.g., COG/KEGG/GO) integrated within Gene3D result in a comprehensive resource that is currently being used in structural genomics initiatives and can be accessed via http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/.
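
The second clustering stage named in the abstract, complete linkage clustering according to sequence identity, can be illustrated with a minimal sketch. This is not the PFscape implementation: the function name `complete_linkage`, the input format (a dictionary of pairwise percent identities), and the 60% cut-off are all assumptions chosen for illustration, and the abstract does not state which thresholds were used. The sketch only shows the defining property of complete linkage, namely that two clusters are merged only if every cross-cluster pair meets the identity threshold.

```python
from itertools import combinations


def complete_linkage(ids, identity, threshold):
    """Agglomerative complete-linkage clustering (illustrative sketch).

    ids       -- list of sequence identifiers
    identity  -- dict mapping frozenset({a, b}) to percent identity
                 (hypothetical input format; missing pairs count as 0)
    threshold -- minimum identity that every cross-cluster pair must reach
                 before two clusters are merged (assumed value, not from
                 the paper)
    """
    clusters = [[i] for i in ids]

    def linkage(c1, c2):
        # Complete linkage: similarity of two clusters is the LOWEST
        # pairwise identity between their members.
        return min(identity.get(frozenset((a, b)), 0.0)
                   for a in c1 for b in c2)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest complete-linkage score.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) < threshold:
            break  # no remaining pair satisfies the threshold
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j

    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Toy identities: a-b and c-d are close, b-c is close, a-c is not.
    pairs = {
        ("a", "b"): 90.0, ("b", "c"): 85.0, ("c", "d"): 80.0,
        ("a", "c"): 40.0, ("a", "d"): 30.0, ("b", "d"): 35.0,
    }
    identity = {frozenset(k): v for k, v in pairs.items()}
    print(complete_linkage(["a", "b", "c", "d"], identity, threshold=60.0))
```

In the toy data, b and c share 85% identity, so a single-linkage scheme would chain all four sequences into one group. Complete linkage keeps them apart because a and c share only 40%, which is why it suits the subdivision of a family into tight subgroups at a fixed identity level.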

Source
http://dx.doi.org/10.1002/prot.20409
