Expert curation is essential to capture knowledge of enzyme functions from the scientific literature in FAIR open knowledgebases but cannot keep pace with the rate of new discoveries and new publications. In this work we present EnzChemRED, for Enzyme Chemistry Relation Extraction Dataset, a new training and benchmarking dataset to support the development of Natural Language Processing (NLP) methods such as (large) language models that can assist enzyme curation. EnzChemRED consists of 1,210 expert curated PubMed abstracts where enzymes and the chemical reactions they catalyze are annotated using identifiers from the protein knowledgebase UniProtKB and the chemical ontology ChEBI.
Expert curation is essential to capture knowledge of enzyme functions from the scientific literature in FAIR open knowledgebases but cannot keep pace with the rate of new discoveries and new publications. In this work we present EnzChemRED, for Enzyme Chemistry Relation Extraction Dataset, a new training and benchmarking dataset to support the development of Natural Language Processing (NLP) methods such as (large) language models that can assist enzyme curation. EnzChemRED consists of 1,210 expert curated PubMed abstracts in which enzymes and the chemical reactions they catalyze are annotated using identifiers from the UniProt Knowledgebase (UniProtKB) and the ontology of Chemical Entities of Biological Interest (ChEBI).
The Gene Ontology (GO) knowledgebase (http://geneontology.org) is a comprehensive resource concerning the functions of genes and gene products (proteins and noncoding RNAs). GO annotations cover genes from organisms across the tree of life as well as viruses, though most gene function knowledge currently derives from experiments carried out in a relatively small number of model organisms.
SwissBioPics (www.swissbiopics.org) is a freely available resource of interactive, high-resolution cell images designed for the visualization of subcellular location data.
The UniProt Knowledgebase (UniProtKB) is a comprehensive, high-quality, and freely accessible resource of protein sequences and functional annotation that covers genomes and proteomes from tens of thousands of taxa, including a broad range of plants and microorganisms producing natural products of medical, nutritional, and agronomical interest. Here we describe work that enhances the utility of UniProtKB as a support for both the study of natural products and their discovery. The foundation of this work is an improved representation of natural product metabolism in UniProtKB using Rhea, an expert-curated knowledgebase of biochemical reactions that is built on the ChEBI (Chemical Entities of Biological Interest) ontology of small molecules.
Motivation: The number of protein records in the UniProt Knowledgebase (UniProtKB: https://www.uniprot.org) continues to grow rapidly as a result of genome sequencing and the prediction of protein-coding genes.
Background: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation.
HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences.
HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences.
The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.
Background: Millions of humans and animals suffer from superficial infections caused by a group of highly specialized filamentous fungi, the dermatophytes, which exclusively infect keratinized host structures. To provide broad insights into the molecular basis of the pathogenicity-associated traits, we report the first genome sequences of two closely phylogenetically related dermatophytes, Arthroderma benhamiae and Trichophyton verrucosum, both of which induce highly inflammatory infections in humans.
Results: 97% of the 22.
In recent years, the general understanding of nutrient sensing and signalling, as well as the knowledge about responses triggered by altered nutrient availability have greatly advanced. While initial studies were directed to top-down elucidation of single nutrient-induced pathways, recent investigations place the individual signalling pathways into signalling networks and pursue the identification of converging effector branches that orchestrate the dynamical responses to nutritional cues. In this review, we focus on Rim15, a protein kinase required in yeast for the proper entry into stationary phase (G0).
Eukaryotic cell proliferation is controlled by growth factors and essential nutrients. In their absence, cells may enter into a quiescent state (G0). In Saccharomyces cerevisiae, the conserved protein kinase A (PKA) and rapamycin-sensitive TOR (TORC1) pathways antagonize G0 entry in response to carbon and/or nitrogen availability primarily by inhibiting the PAS kinase Rim15 function.
Regulated interactions between microtubules (MTs) and the cell cortex control MT dynamics and position the mitotic spindle. In eukaryotic cells, the adenomatous polyposis coli/Kar9p and dynein/dynactin pathways are involved in guiding MT plus ends and MT sliding along the cortex, respectively. Here we identify Bud14p as a novel cortical activator of the dynein/dynactin complex in budding yeast.
The Ccr4-Not complex is a conserved global regulator of gene expression, which serves as a regulatory platform that senses and/or transmits nutrient and stress signals to various downstream effectors. Presumed effectors of this complex in yeast are TFIID, a general transcription factor that associates with the core promoter, and Msn2, a key transcription factor that regulates expression of stress-responsive element (STRE)-controlled genes. Here we show that the constitutively high level of STRE-driven expression in ccr4-not mutants results from two independent effects.
The highly conserved Tor kinases (TOR) and the protein kinase A (PKA) pathway regulate cell proliferation in response to growth factors and/or nutrients. In Saccharomyces cerevisiae, loss of either TOR or PKA causes cells to arrest growth early in G(1) and to enter G(0) by mechanisms that are poorly understood. Here we demonstrate that the protein kinase Rim15 is required for entry into G(0) following inactivation of TOR and/or PKA.