Background: Proteins are composed of one or several building blocks, known as domains. Such domains can be classified into families according to their evolutionary origin. Whereas sequencing technologies have advanced immensely in recent years, there are no matching computational methodologies for large-scale determination of protein domains and their boundaries. We provide and rigorously evaluate a novel set of domain families that is automatically generated from sequence data. Our domain family identification process, called EVEREST (EVolutionary Ensembles of REcurrent SegmenTs), begins by constructing a library of protein segments that emerge in an all vs. all pairwise sequence comparison. It then proceeds to cluster these segments into putative domain families. The best putative families are selected using machine learning techniques. A statistical model is then created for each of the chosen families. This procedure is then iterated: the statistical models are used to scan all protein sequences, to recreate a library of segments, and to cluster them again.
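One round of the segment clustering step described above might be sketched as follows. This is a toy illustration only, not the EVEREST implementation: the segment tuples, the single-linkage grouping, and the `min_size` filter (a stand-in for the machine-learning selection) are all assumptions made for the example, and the model-building and rescanning steps are omitted.

```python
from collections import defaultdict

# Assumed representation: a segment is (protein_id, start, end), and a "hit"
# is a pair of segments that aligned in the all vs. all comparison.

def cluster_segments(hits):
    """Group segments linked by pairwise hits (single linkage via union-find)."""
    parent = {}

    def find(seg):
        parent.setdefault(seg, seg)
        while parent[seg] != seg:
            parent[seg] = parent[parent[seg]]  # path halving
            seg = parent[seg]
        return seg

    for a, b in hits:
        parent[find(a)] = find(b)  # merge the two groups

    clusters = defaultdict(set)
    for seg in list(parent):
        clusters[find(seg)].add(seg)
    return list(clusters.values())


def select_families(clusters, min_size=3):
    """Hypothetical stand-in for the selection step: keep larger clusters."""
    return [c for c in clusters if len(c) >= min_size]


hits = [
    (("P1", 1, 50), ("P2", 10, 60)),
    (("P2", 10, 60), ("P3", 5, 55)),
    (("P4", 100, 150), ("P5", 90, 140)),
]
families = select_families(cluster_segments(hits))
```

In a full iteration, each selected family would be turned into a statistical model and used to rescan the sequences, producing a new segment library for the next round.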
Results: Processing the Swiss-Prot section of the UniProt Knowledgebase, release 7.2, EVEREST defines 20,230 domains, covering 85% of the amino acids of the Swiss-Prot database. EVEREST annotates 11,852 proteins (6% of the database) that are not annotated by Pfam A. In addition, in 43,086 proteins (20% of the database), EVEREST annotates a part of the protein that is not annotated by Pfam A. Performance tests show that EVEREST recovers 56% of Pfam A families and 63% of SCOP families with high accuracy, and suggests previously unknown domain families with at least 51% fidelity. EVEREST domains are often a combination of domains as defined by Pfam or SCOP and are frequently sub-domains of such domains.
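A residue-level coverage figure such as the one above can be computed by taking the union of domain intervals per protein, so that overlapping domains are not counted twice. The sketch below assumes 1-based inclusive coordinates and a hypothetical input layout (`protein_id -> (length, [(start, end), ...])`); it is not taken from the EVEREST code.

```python
def covered_residues(domains):
    """Count residues covered by at least one domain (1-based, inclusive)."""
    covered = 0
    current_end = 0  # last residue already counted
    for start, end in sorted(domains):
        if end <= current_end:
            continue  # fully inside an earlier domain
        covered += end - max(start, current_end + 1) + 1
        current_end = end
    return covered


def coverage_fraction(proteins):
    """Fraction of all residues covered by any domain annotation."""
    total = sum(length for length, _ in proteins.values())
    covered = sum(covered_residues(doms) for _, doms in proteins.values())
    return covered / total if total else 0.0


# Toy data, invented for illustration.
example = {
    "A": (100, [(1, 50), (40, 80)]),  # overlapping domains cover 80 residues
    "B": (100, []),                   # unannotated protein
}
fraction = coverage_fraction(example)
```

The same function applied to two annotation sets (for example, one per library) would allow the per-protein differences reported above to be tallied.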
Conclusion: The EVEREST process and its output domain families provide an exhaustive and validated view of the protein domain world that is automatically generated from sequence data. The EVEREST library of domain families, accessible for browsing and download at 1, provides a complementary view to that provided by other existing libraries. Furthermore, since it is automatic, the EVEREST process is scalable and we will run it in the future on larger databases as well. The EVEREST source files are available for download from the EVEREST web site.
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1533870 | PMC
http://dx.doi.org/10.1186/1471-2105-7-277 | DOI Listing
Cell Mol Biol (Noisy-le-grand)
January 2025
Department of Pharmacology, Faculty of Pharmacy, Mersin University, Mersin, Türkiye.
Increasing evidence suggests that inhibition of the receptor-interacting serine/threonine-protein kinase (RIPK) 1/RIPK3/mixed lineage kinase domain-like pseudokinase (MLKL) necrosome has protective effects in in vivo models of painful human conditions associated with inflammation and demyelination in the central nervous system. However, the contribution of RIPK1-driven necroptosis to inflammatory pain remains unknown. Therefore, this study aims to determine the effect of necrostatin (Nec)-1s, a selective RIPK1 inhibitor, on lipopolysaccharide (LPS)-induced inflammatory pain and the related underlying mechanisms.
Clin Transl Med
January 2025
Allergy Center, Department of Otolaryngology, Affiliated Eye and ENT Hospital, Fudan University, Shanghai, China.
Background: House dust mite (HDM) is the leading allergen for allergic rhinitis (AR). Although allergic sensitisation by inhaled allergens renders susceptible individuals prone to developing AR, the molecular mechanisms driving this process remain incompletely elucidated.
Objective: This study aimed to elucidate the molecular mechanisms underlying HDM-induced AR.
Xi Bao Yu Fen Zi Mian Yi Xue Za Zhi
January 2025
Department of Pathogen Biology and Immunology, Kunming Medical University, Kunming 650500, China.
The innate immune response is the first line of defense for the host against viral infections. Targeted degradation of pathogenic microorganisms through autophagy, in conjunction with pattern recognition receptors synergistically inducing the production of interferon (IFN), constitutes an important pathway for the body to resist viral infections. Rubicon, a Run domain Beclin 1-interacting and cysteine-rich domain protein, has an inhibitory effect on autophagy and IFN production.
Am J Hum Genet
January 2025
Institute of Human Genetics, University Medical Center Hamburg-Eppendorf, 20246 Hamburg, Germany; Institute of Human Genetics, University of Regensburg, 93053 Regensburg, Germany; Institute of Clinical Human Genetics, University Hospital Regensburg, 93053 Regensburg, Germany.
BCL11B is a Cys2-His2 zinc-finger (C2H2-ZnF) domain-containing, DNA-binding, transcription factor with established roles in the development of various organs and tissues, primarily the immune and nervous systems. BCL11B germline variants have been associated with a variety of developmental syndromes. However, genotype-phenotype correlations along with pathophysiologic mechanisms of selected variants mostly remain elusive.
J Exp Bot
January 2025
Institute of Molecular Plant Biology, Department of Biology, ETH Zurich, Universitätsstrasse 2, 8092 Zurich, Switzerland.
The Arabidopsis root apical meristem is an excellent model for studying plant organ growth that involves a coordinated process of cell division, elongation, and differentiation, while each tissue type develops on its own schedule. Among these tissues, the protophloem is particularly important, differentiating early to supply nutrients and signalling molecules to the growing root tip. The OCTOPUS (OPS) protein and its homolog OPS-LIKE 2 (OPL2) are essential for proper root protophloem differentiation and, likely through this role, indirectly promote root growth.