Background: Data provenance refers to the origin, processing, and movement of data. Reliable and precise knowledge about data provenance has great potential to improve reproducibility as well as quality in biomedical research and, therefore, to foster good scientific practice. However, despite the increasing interest in data provenance technologies in the literature and their implementation in other disciplines, these technologies have not yet been widely adopted in biomedical research.

Objective: The aim of this scoping review was to provide a structured overview of the body of knowledge on provenance methods in biomedical research by systematizing articles covering data provenance technologies developed for or used in this application area; describing and comparing the functionalities as well as the design of the provenance technologies used; and identifying gaps in the literature, which could provide opportunities for future research on technologies that could receive more widespread adoption.

Methods: Following a methodological framework for scoping studies and the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines, articles were identified by searching the PubMed, IEEE Xplore, and Web of Science databases and subsequently screened for eligibility. We included original articles covering software-based provenance management for scientific research published between 2010 and 2021. A set of data items was defined along the following five axes: publication metadata, application scope, provenance aspects covered, data representation, and functionalities. The data items were extracted from the articles, stored in a charting spreadsheet, and summarized in tables and figures.

Results: We identified 44 original articles published between 2010 and 2021. We found that the solutions described were heterogeneous along all axes. We also identified relationships among motivations for the use of provenance information, feature sets (capture, storage, retrieval, visualization, and analysis), and implementation details such as the data models and technologies used. An important gap that we identified is that only a few publications address the analysis of provenance data or use established provenance standards, such as PROV.

Conclusions: The heterogeneity of provenance methods, models, and implementations found in the literature points to the lack of a unified understanding of provenance concepts for biomedical data. Providing a common framework, a biomedical reference, and benchmarking data sets could foster the development of more comprehensive provenance solutions.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10132013
DOI: http://dx.doi.org/10.2196/42289

Similar Publications

Nanosafety assessment, which seeks to evaluate the risks from exposure to nanoscale materials, spans materials synthesis and characterisation, exposure science, toxicology, and computational approaches, resulting in complex experimental workflows and diverse data types. Managing the data flows, with a focus on provenance (who generated the data and for what purpose) and quality (how was the data generated, using which protocol with which controls), as part of good research output management, is necessary to maximise the reuse potential and value of the data. Instance maps have been developed and evolved to visualise experimental nanosafety workflows and to bridge the gap between the theoretical principles of FAIR (Findable, Accessible, Interoperable and Re-usable) data and the everyday practice of experimental researchers.

We studied freshly collected, dried and herbarized leaf fragments of two palms, namely L. and L., most commonly used for palm-leaf manuscript (PLM) production in South (S) and Southeast (SE) Asia, in order to reveal differences in their phytolith assemblages.

Unlabelled: This study addresses longstanding questions concerning the ore sources used in the first series of coins of ancient Athens known as the (c.540-c.500 BCE) by combining comprehensive numismatic data on 22 coins (16 new and 6 legacy analyses) with lead isotope and surface elemental measurements (MC-ICP-MS and XRF).

Objective: This study aimed to qualitatively study the main chemical components of apple peel in APORT, Kazakhstan, by ultra-performance liquid chromatography-quadrupole-time-of-flight mass spectrometry (UPLC-Q-TOF-MS/MS) and to compare the components of apple peels with different provenances.

Methods: An ACQUITY UPLC HSS T3 (100 mm × 2.1 mm, 1.

Genomic analysis of three medieval parchments from German monasteries.

Sci Rep

January 2025

Breeding Informatics Group, Department of Animal Sciences, Georg-August University, 37075, Göttingen, Germany.

In the last two decades there has been growing interest in the analysis of ancient DNA obtained from the parchment used in historic documents. The genetic insight that this data provides makes collections of historic documents an invaluable source for studying the development and spread of historical livestock populations. Additionally, the biological data may provide new information for the historical analysis that could be used to determine the provenance as well as the authenticity of these documents.
