Background: We present the Biological Observation Matrix (BIOM, pronounced "biome") format: a JSON-based file format for representing arbitrary observation-by-sample contingency tables with associated sample and observation metadata. As the number of categories of comparative omics data types (collectively, the "ome-ome") grows rapidly, a general format to represent and archive these data will facilitate the interoperability of existing bioinformatics tools and future meta-analyses.

Findings: The BIOM file format is supported by an independent open-source software project (the biom-format project). The project initially contains Python objects that support the use and manipulation of BIOM data in Python programs. It is intended to be an open development effort in which developers can submit implementations of these objects in other programming languages.

Conclusions: The BIOM file format and the biom-format project are steps toward reducing the "bioinformatics bottleneck" currently experienced in diverse areas of the biological sciences, and will help us move toward the next phase of comparative omics, in which basic science is translated into clinical and environmental applications. The BIOM file format is currently recognized as an Earth Microbiome Project Standard, and as a Candidate Standard by the Genomic Standards Consortium.
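
As a rough illustration of what a JSON-based observation-by-sample table with metadata can look like, the sketch below builds a small sparse table with the Python standard library and round-trips it through JSON. It is not the biom-format project's API. The helper names (`build_table`, `to_dense`), the example identifiers, and the metadata values are hypothetical, and the key names (`rows`, `columns`, `shape`, `data`) are assumptions loosely modeled on the published format that should be checked against the official specification.

```python
import json


def build_table(observations, samples, dense_counts):
    """Build a dict holding a sparse observation-by-sample table.

    observations, samples: lists of (id, metadata-or-None) pairs.
    dense_counts: one row per observation, one value per sample.
    Key names are illustrative assumptions, not the official schema.
    """
    # Keep only non-zero cells as [row_index, column_index, value] triplets.
    sparse = [
        [i, j, value]
        for i, row in enumerate(dense_counts)
        for j, value in enumerate(row)
        if value != 0
    ]
    return {
        "rows": [{"id": oid, "metadata": meta} for oid, meta in observations],
        "columns": [{"id": sid, "metadata": meta} for sid, meta in samples],
        "matrix_type": "sparse",
        "shape": [len(observations), len(samples)],
        "data": sparse,
    }


def to_dense(table):
    """Expand the sparse triplets back into a list-of-lists matrix."""
    n_rows, n_cols = table["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, value in table["data"]:
        dense[i][j] = value
    return dense


# Hypothetical example data: two observations counted across three samples.
observations = [
    ("OTU_1", {"taxonomy": ["Bacteria", "Firmicutes"]}),
    ("OTU_2", None),
]
samples = [
    ("Sample_A", {"body_site": "gut"}),
    ("Sample_B", {"body_site": "skin"}),
    ("Sample_C", None),
]
counts = [
    [5, 0, 2],
    [0, 0, 7],
]

table = build_table(observations, samples, counts)
serialized = json.dumps(table, indent=2)  # text that could be written to a file
restored = json.loads(serialized)         # parse it back into Python objects
```

Storing only the non-zero cells keeps the serialized table small when most observations are absent from most samples, and `to_dense(restored)` recovers the original count matrix when a dense view is needed.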


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3626512
DOI: http://dx.doi.org/10.1186/2047-217X-1-7


