Although there are considerable site-based data for individual ecosystems or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km²). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database.
Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4488039
DOI: http://dx.doi.org/10.1186/s13742-015-0067-4
