OMeta: an ontology-based, data-driven metadata tracking system.

BMC Bioinformatics

J. Craig Venter Institute, 9605 Medical Center Drive, Suite 150, Rockville, MD, 20850, USA.

Published: January 2019

Background: The development of high-throughput sequencing and analysis has accelerated multi-omics studies of thousands of microbial species, metagenomes, and infectious disease pathogens. Omics studies are enabling genotype-phenotype association studies that identify genetic determinants of pathogen virulence and drug resistance, as well as phylogenetic studies designed to track the origin and spread of disease outbreaks. These omics studies are complex and often employ multiple assay technologies, including genomics, metagenomics, transcriptomics, proteomics, and metabolomics. To maximize the impact of omics studies, it is essential that data be accompanied by detailed contextual metadata (e.g., specimen, spatial-temporal, and phenotypic characteristics) in clear, organized, and consistent formats. Over the years, many metadata standards have arisen from various standards initiatives, including the Genomic Standards Consortium's minimal information standards (MIxS) and the GSCID/BRC Project and Sample Application Standard. Some tools exist for tracking metadata, but they do not provide event-based capabilities to configure, collect, validate, and distribute metadata. To address this gap in the scientific community, an event-based, data-driven application, OMeta, was created that allows users to quickly configure, collect, validate, distribute, and integrate metadata.

Results: OMeta, a data-driven web application for researchers, consists of a browser-based interface, a command-line interface (CLI), and server-side components that together provide an intuitive platform for configuring, capturing, viewing, and sharing metadata. Project and sample metadata can be set based on existing standards or on project goals. Recorded information includes details on the biological samples, procedures, protocols, and experimental technologies. This information can be organized based on events, including sample collection, sample quantification, sequencing assay, and analysis results. Fields can be configured with various presentation types (checkbox, file, drop-box, and ontology) and can be configured to use the National Center for Biomedical Ontology (NCBO), a biomedical ontology server. Furthermore, OMeta maintains a complete audit trail of all changes made by users and allows metadata export in comma-separated values (CSV) format for convenient deposition of data into public databases.

Conclusions: We present OMeta, a web-based software application built on data-driven principles for configuring and customizing data standards and for capturing, curating, and sharing metadata.
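
The event-based workflow described in the Results (configure an event, validate submitted metadata, keep an audit trail, export CSV) can be illustrated with a minimal sketch. This is not OMeta's code or API: the template layout, the `validate` function, the `MetadataStore` class, and all attribute names below are hypothetical and chosen only for illustration.

```python
import csv
import io
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical event template (an assumption, not OMeta's format):
# attribute name -> (required?, set of allowed values or None for free text)
SAMPLE_COLLECTION = {
    "sample_id": (True, None),
    "collection_date": (True, None),
    "host": (False, {"human", "mouse", "environment"}),
}


def validate(template, record):
    """Return a list of problems found in record; an empty list means valid."""
    errors = []
    for name, (required, allowed) in template.items():
        value = record.get(name, "")
        if required and not value:
            errors.append(f"missing required attribute: {name}")
        elif value and allowed is not None and value not in allowed:
            errors.append(f"invalid value for {name}: {value}")
    for name in record:
        if name not in template:
            errors.append(f"unknown attribute: {name}")
    return errors


@dataclass
class MetadataStore:
    """In-memory stand-in for a metadata database with an audit trail."""
    events: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def submit(self, user, event_type, template, record):
        """Validate a record, store it, and log who submitted it and when."""
        errors = validate(template, record)
        if errors:
            raise ValueError("; ".join(errors))
        self.events.append((event_type, dict(record)))
        timestamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((timestamp, user, event_type, dict(record)))

    def export_csv(self, event_type, template):
        """Export all stored records of one event type as CSV text."""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(template), lineterminator="\n")
        writer.writeheader()
        for stored_type, record in self.events:
            if stored_type == event_type:
                writer.writerow(record)  # missing optional fields become ""
        return buf.getvalue()


store = MetadataStore()
store.submit(
    "alice",
    "SampleCollection",
    SAMPLE_COLLECTION,
    {"sample_id": "S1", "collection_date": "2018-05-01", "host": "human"},
)
print(store.export_csv("SampleCollection", SAMPLE_COLLECTION))
```

In this sketch the template is plain data rather than code, which mirrors the data-driven idea in the abstract: adding a new event type or attribute means editing configuration, not the validation or export logic.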


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6322262
DOI: http://dx.doi.org/10.1186/s12859-018-2580-9


