Background: Genome and proteome annotation pipelines are generally custom built and not easily reusable by other groups. This leads to duplication of effort, increased costs, and suboptimal annotation quality. One way to address these issues is to encourage the adoption of annotation standards and technological solutions that enable the sharing of biological knowledge and tools for genome and proteome annotation.

Results: Here we demonstrate one approach to generate portable genome and proteome annotation pipelines that users can run without recourse to custom software. This proof of concept uses our own rule-based annotation pipeline HAMAP, which provides functional annotation for protein sequences to the same depth and quality as UniProtKB/Swiss-Prot, and the World Wide Web Consortium (W3C) standards Resource Description Framework (RDF) and SPARQL (a recursive acronym for the SPARQL Protocol and RDF Query Language). We translate complex HAMAP rules into the W3C standard SPARQL 1.1 syntax, and then apply them to protein sequences in RDF format using freely available SPARQL engines. This approach supports the generation of annotation that is identical to that generated by our own in-house pipeline, using standard, off-the-shelf solutions, and is applicable to any genome or proteome annotation pipeline.
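
As a rough illustration of what translating a rule into SPARQL 1.1 can look like, the sketch below renders a deliberately simplified toy rule as a CONSTRUCT query string. It is not the actual HAMAP rule format or vocabulary: the `ToyRule` class, the `ex:` namespace, and the `ex:matchesSignature`, `ex:kingdom`, and `ex:annotation` predicates are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical namespace for this sketch; not a HAMAP or UniProt vocabulary.
EX = "http://example.org/annotation#"


@dataclass(frozen=True)
class ToyRule:
    """A toy stand-in for an annotation rule (assumed structure)."""
    signature_id: str  # local name of the family signature a protein must match
    kingdom: str       # taxonomic condition on the source organism
    annotation: str    # free-text annotation to attach when conditions hold


def sparql_literal(text: str) -> str:
    """Quote a Python string as a SPARQL string literal."""
    escaped = (
        text.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')        # then double quotes
        .replace("\n", "\\n")       # then newlines
    )
    return f'"{escaped}"'


def rule_to_sparql(rule: ToyRule) -> str:
    """Render the toy rule as a SPARQL 1.1 CONSTRUCT query string."""
    return "\n".join([
        f"PREFIX ex: <{EX}>",
        "CONSTRUCT {",
        f"  ?protein ex:annotation {sparql_literal(rule.annotation)} .",
        "}",
        "WHERE {",
        f"  ?protein ex:matchesSignature ex:{rule.signature_id} ;",
        f"           ex:kingdom {sparql_literal(rule.kingdom)} .",
        "}",
    ])


toy_rule = ToyRule("SIG_0001", "Bacteria", 'Example "enzyme" activity')
query = rule_to_sparql(toy_rule)
print(query)
```

Because the output is plain SPARQL 1.1 text, a query of this kind could in principle be submitted to any standards-compliant SPARQL engine holding protein data as RDF, which is the portability argument made above. The real HAMAP rules are considerably more complex than this toy case.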

Conclusions: HAMAP SPARQL rules are freely available for download from the HAMAP FTP site, ftp://ftp.expasy.org/databases/hamap/sparql/, under the CC-BY-ND 4.0 license. The annotations generated by the rules are under the CC-BY 4.0 license. A tutorial and supplementary code to use HAMAP as SPARQL are available on GitHub at https://github.com/sib-swiss/HAMAP-SPARQL, and general documentation about HAMAP can be found on the HAMAP website at https://hamap.expasy.org.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7007698
DOI: http://dx.doi.org/10.1093/gigascience/giaa003