ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree.

Nucleic Acids Res

School of Computer Science and Engineering, Institute of Life Sciences, The Sudarsky Center for Computational Biology, The Hebrew University of Jerusalem, 91904 Israel.

Published: January 2012

ProtoNet 6.0 (http://www.protonet.cs.huji.ac.il) is a data structure of protein families that cover the protein sequence space. These families are generated by an unsupervised bottom-up clustering algorithm, which organizes large sets of proteins into a hierarchical tree that yields high-quality protein families. The 2012 ProtoNet (Version 6.0) tree includes over 9 million proteins, of which 5.5% come from UniProtKB/SwissProt and the rest from UniProtKB/TrEMBL. The hierarchical tree structure is based on an all-against-all comparison of 2.5 million representatives of UniRef50. Rigorous annotation-based quality tests prune the tree to the 162,088 most informative clusters. Every high-quality cluster is assigned a ProtoName that reflects the most significant annotations of its proteins. These annotations are dominated by GO terms, UniProt/Swiss-Prot keywords and InterPro. ProtoNet 6.0 operates in a default mode; when used in the advanced mode, this data structure offers the user a view of the family tree at any desired level of resolution. Systematic comparisons with previous versions of ProtoNet show how our view of protein families evolves as larger parts of the sequence space become known. ProtoNet 6.0 also provides numerous tools to navigate the hierarchy of clusters.
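
To make the idea of a bottom-up hierarchical tree viewed "at any desired level of resolution" concrete, here is a minimal sketch in Python. It is a generic average-linkage (arithmetic mean) agglomerative clustering on a toy distance table and is not ProtoNet's actual merging rule or code. The protein names P1 to P4, the distance values, and the functions `average_linkage` and `cut_tree` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def average_linkage(dist):
    """Bottom-up (agglomerative) clustering over pairwise distances.

    dist maps frozenset({a, b}) -> distance between leaf items a and b.
    Returns the merge history as a list of (left, right, height) tuples,
    where left and right are frozensets of leaf names.
    """
    leaves = sorted({x for pair in dist for x in pair})
    clusters = [frozenset([leaf]) for leaf in leaves]
    merges = []

    def cluster_distance(c1, c2):
        # Arithmetic mean of all leaf-to-leaf distances between two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        left, right = min(combinations(clusters, 2),
                          key=lambda pair: cluster_distance(*pair))
        height = cluster_distance(left, right)
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left | right)
        merges.append((left, right, height))
    return merges


def cut_tree(merges, leaves, threshold):
    """Clusters obtained by applying only the merges with height <= threshold."""
    clusters = {frozenset([leaf]) for leaf in leaves}
    for left, right, height in merges:
        if height <= threshold:
            clusters -= {left, right}
            clusters.add(left | right)
    return clusters


# Hypothetical toy distances between four made-up proteins.
toy = {
    frozenset(("P1", "P2")): 1.0,
    frozenset(("P3", "P4")): 2.0,
    frozenset(("P1", "P3")): 8.0,
    frozenset(("P1", "P4")): 9.0,
    frozenset(("P2", "P3")): 7.0,
    frozenset(("P2", "P4")): 8.0,
}
merges = average_linkage(toy)
leaves = ["P1", "P2", "P3", "P4"]

# A coarse cut yields one root cluster; a finer cut yields two families.
print(cut_tree(merges, leaves, threshold=10.0))
print(cut_tree(merges, leaves, threshold=2.5))
```

Raising or lowering `threshold` plays the role of choosing a resolution: a low value returns many small clusters near the leaves, and a high value returns a few large clusters near the root. A real system working at the scale described above would need a far more efficient procedure than this quadratic toy loop.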

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245180
DOI: http://dx.doi.org/10.1093/nar/gkr1027

