The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt website receives about 800,000 unique visitors per month and is the primary means to access UniProt. It provides 10 searchable datasets and four main tools.
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms.
Phosphatases play an essential role in the regulation of protein phosphorylation. Less abundant than kinases, many phosphatases are components of one or more macromolecular complexes with different substrate specificities and specific functionalities. The expert scientific curation of phosphatase complexes for the UniProt and Complex Portal databases supports the whole scientific community by collating and organising small- and large-scale experimental data from the scientific literature into context-specific central resources, where the data can be freely accessed and used to further academic and translational research.
DNA synthesis during replication or repair is a fundamental cellular process that is catalyzed by a set of evolutionary conserved polymerases. Despite a large body of research, the DNA polymerases of have not yet been systematically reviewed, leading to inconsistencies in their nomenclature, shortcomings in their functional (Gene Ontology, GO) annotations and an under-appreciation of the extent of their characterization. Here, we describe the complete set of DNA polymerases in , applying nomenclature already in widespread use in other species, and improving their functional annotation.
The universal protein knowledgebase (UniProtKB) collects and centralises functional information on proteins across a wide range of species. In addition to the functional information added to all protein entries, for enzymes, which represent 20-40% of most proteomes, UniProtKB provides additional information about Enzyme Commission classification, catalytic activity, cofactors, enzyme regulation, kinetics and pathways, all based on critical assessment of published experimental data. Computer-based analysis and structural data are used to enrich the annotation of the sequence through the identification of active sites and binding sites.
Curr Protoc Bioinformatics
June 2018
Public availability of biological sequences is essential for their widespread access and use by the research community. The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and functional data. While most protein sequences entering UniProt are imported from other source databases containing nucleotide or 3-D structure data, protein sequences determined at the protein level can be submitted directly to UniProt.
Motivation: Biological knowledgebases, such as UniProtKB/Swiss-Prot, constitute an essential component of daily scientific research by offering distilled, summarized and computable knowledge extracted from the literature by expert curators. While knowledgebases play an increasingly important role in the scientific community, their ability to keep up with the growth of biomedical literature is under scrutiny. Using UniProtKB/Swiss-Prot as a case study, we address this concern via multiple literature triage approaches.
Protein kinases form one of the largest protein families and are found in all species, from viruses to humans. They catalyze the reversible phosphorylation of proteins, often modifying their activity and localization. They are implicated in virtually all cellular processes and are one of the most intensively studied protein families.
Background: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging.
Advances in high-throughput technologies allow researchers to routinely perform whole genome and proteome analysis. For this purpose, they need high-quality resources providing comprehensive gene and protein sets for their organisms of interest. Using the example of the human proteome, we will describe the content of a complete proteome in the UniProt Knowledgebase (UniProtKB).
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt Web site receives ∼400,000 unique visitors per month and is the primary means to access UniProt. It provides ten searchable datasets and three main tools.
UniProtKB/Swiss-Prot provides expert curation with information extracted from literature and curator-evaluated computational analysis. As knowledgebases continue to play an increasingly important role in scientific research, a number of studies have evaluated their accuracy and revealed various errors. While some are curation errors, others are the result of incorrect information published in the scientific literature.
The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.
The UniProt Knowledgebase (UniProtKB) acts as a central hub of protein knowledge by providing a unified view of protein sequence and functional information. Manual and automatic annotation procedures are used to add data directly to the database while extensive cross-referencing to more than 120 external databases provides access to additional relevant information in more specialized data collections. UniProtKB also integrates a range of data from other resources.
Motivation: The advent of sequencing and structural genomics projects has provided a dramatic boost in the number of uncharacterized protein structures and sequences. Consequently, many computational tools have been developed to help elucidate protein function. However, such services are spread throughout the world, often with standalone web pages.
The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases.
Background: The Gene Ontology Annotation (GOA) database http://www.ebi.ac.
The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references.
To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references).