AI development in biotechnology relies on high-quality data to train and validate algorithms. The FAIR principles (Findable, Accessible, Interoperable, and Reusable), together with regulatory frameworks such as the In Vitro Diagnostic Regulation (IVDR) and the Medical Device Regulation (MDR), set requirements on specimen and data provenance to ensure the quality and traceability of the data used in AI development. In this paper, a framework is presented for recording and publishing provenance information that meets these requirements.
Provenance is information describing the lineage of an object, such as a dataset or biological material. Since these objects can be passed between organizations, each organization can document only part of the object's life cycle. As a result, the provenance of an object is scattered across organizations, and interconnecting these distributed parts forms distributed provenance chains.
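As an illustration of the idea (a minimal sketch; the class and field names below are invented, not the paper's model), a distributed chain can be rebuilt by following backward references between the provenance parts held by different organizations:

```python
# Minimal sketch of linking distributed provenance parts into a chain.
# Names and structure are illustrative assumptions, not a published model.
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class ProvenancePart:
    """Provenance recorded by one organization for one stage of an object's life cycle."""
    record_id: str
    organization: str
    description: str
    previous_record: Optional[str] = None  # identifier of the predecessor part, held by another organization

def reconstruct_chain(parts: Dict[str, ProvenancePart], last_id: str) -> List[ProvenancePart]:
    """Walk backwards from the most recent part to rebuild the full chain in order."""
    chain: List[ProvenancePart] = []
    current: Optional[str] = last_id
    while current is not None:
        part = parts[current]
        chain.append(part)
        current = part.previous_record
    return list(reversed(chain))

# Example: a sample collected by a biobank, then sequenced by a laboratory.
registry = {
    "biobank/1": ProvenancePart("biobank/1", "Biobank A", "sample collected"),
    "lab/9": ProvenancePart("lab/9", "Lab B", "sample sequenced", previous_record="biobank/1"),
}
for part in reconstruct_chain(registry, "lab/9"):
    print(part.organization, "-", part.description)
```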
Stud Health Technol Inform, May 2022
The distributed nature of modern research emphasizes the importance of collecting and sharing the history of digital and physical material, to improve the reproducibility of experiments and the quality and reusability of results. Yet, current methodologies for recording provenance information are applied in a scattered fashion, leading to silos of provenance information at different granularities. To tackle this fragmentation, we developed the Common Provenance Model, a set of guidelines for generating interoperable provenance information and for enabling the reconstruction and navigation of a continuous provenance chain.
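To give a flavour of what interoperable provenance exchange can look like, the sketch below expresses a two-organization chain in the W3C PROV data model using the Python `prov` package; the identifiers and the choice of library are assumptions made for illustration, not the Common Provenance Model itself.

```python
# Illustrative sketch: two organizations describe their part of an object's
# history in W3C PROV terms, linked by shared identifiers.
from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("orgA", "https://biobank.example/ns#")
doc.add_namespace("orgB", "https://lab.example/ns#")

# Organization A documents the physical sample it collected.
sample = doc.entity("orgA:sample-42", {"prov:type": "biological-material"})

# Organization B documents the sequencing run that consumed that sample
# and the dataset it produced, referring to A's entity by its identifier.
run = doc.activity("orgB:sequencing-run-7")
dataset = doc.entity("orgB:dataset-7", {"prov:type": "dataset"})
doc.used(run, sample)
doc.wasGeneratedBy(dataset, run)
doc.wasDerivedFrom(dataset, sample)

print(doc.serialize())  # PROV-JSON that either party can exchange or archive
```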
Background: The International Society of Urological Pathology (ISUP) revised the Gleason system in 2005 and 2014. The impact of these changes on prostate cancer (PCa) prognostication remains unclear.
Objective: To evaluate if the ISUP 2014 Gleason score (GS) predicts PCa death better than the pre-2005 GS, and if additional histopathological information can further improve PCa death prediction.
The FAIR Principles are a set of recommendations that aim to underpin knowledge discovery and integration by making research outcomes Findable, Accessible, Interoperable and Reusable. These guidelines encourage the accurate recording and exchange of data, coupled with contextual information about their creation, expressed in domain-specific standards and machine-readable formats. This paper analyses how the openEHR specifications and reference implementation can support FAIRness, by theoretically assessing their compliance with each of the 15 FAIR principles.
Stud Health Technol Inform, May 2021
The data produced during a research project are too often collected for the sole purpose of the study, which hinders profitable reuse in similar contexts. The growing need to counteract this trend has recently led to the formalization of the FAIR principles, which aim to make (meta)data Findable, Accessible, Interoperable and Reusable, for humans and machines. Since their introduction, efforts have been ongoing to encourage their adoption and to implement solutions based on them.
Virtual microscopy (VM) holds promise to reduce subjectivity as well as intra- and inter-observer variability in the histopathological evaluation of prostate cancer. We evaluated (i) the repeatability (intra-observer agreement) and reproducibility (inter-observer agreement) of the 2014 Gleason grading system and other selected features using standard light microscopy (LM) and an internally developed VM system, and (ii) the interchangeability of LM and VM. Two uro-pathologists reviewed 413 cores from 60 Swedish men diagnosed with non-metastatic prostate cancer between 1998 and 2014.
Current high-throughput sequencing technologies allow entire genomes to be acquired in a very short time and at a relatively sustainable cost, resulting in an increasing availability of genetic testing in specialized clinical laboratories and research centers. In contrast, the impact of genomic information on clinical decisions is still limited, as effective interpretation remains a challenging task. From the technological point of view, genomic data are large, have a complex granular nature, and depend strongly on the computational steps of the generation and processing workflows.
In this paper, we describe the Prognostic Factors for Mortality in Prostate Cancer (ProMort) study and use it to demonstrate how the weighted likelihood method can be used in nested case-control studies to estimate both relative and absolute risks in the competing-risks setting. ProMort is a case-control study nested within the National Prostate Cancer Register (NPCR) of Sweden, comprising 1,710 men diagnosed with low- or intermediate-risk prostate cancer between 1998 and 2011 who died from prostate cancer (cases) and 1,710 matched controls. Cause-specific hazard ratios and cumulative incidence functions (CIFs) for prostate cancer death were estimated in ProMort using weighted flexible parametric models and compared with the corresponding estimates from the NPCR cohort.
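As context for the method (a sketch using standard competing-risks definitions, not formulas quoted from the paper): with cause-specific hazards $\lambda_k(t)$ fitted from a likelihood in which each sampled control is weighted by the inverse of its inclusion probability (cases receive weight 1), the cumulative incidence function for prostate cancer death is

$$F_{\mathrm{PCa}}(t) = \int_0^t S(u)\,\lambda_{\mathrm{PCa}}(u)\,du, \qquad S(u) = \exp\!\Bigl(-\sum_k \int_0^u \lambda_k(v)\,dv\Bigr),$$

where the sum runs over prostate cancer death and the competing causes, so that the absolute risk accounts for the possibility of dying from another cause first.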
Purpose: The increasing use of high-throughput sequencing in personalized medicine brings new challenges to the realm of healthcare informatics. Patient records need to accommodate data of unprecedented size and complexity, as well as keep track of their production process. In this work we present a solution for integrating genomic data into electronic health records via openEHR archetypes.
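To make the idea concrete, here is a minimal, hypothetical sketch of a genomic finding bundled with its production process in a form that could be mapped onto an electronic health record entry; it is not an actual openEHR archetype, and all field names and values are invented for illustration.

```python
# Illustrative sketch only: a simplified structure for a genomic result plus
# its production process. Not an openEHR archetype; names are hypothetical.
import json
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class PipelineStep:
    tool: str       # software used in the generation/processing workflow
    version: str    # recorded so the result can be traced and reproduced

@dataclass
class GenomicResultEntry:
    patient_id: str
    variant: str            # placeholder variant description
    reference_genome: str
    pipeline: List[PipelineStep] = field(default_factory=list)

entry = GenomicResultEntry(
    patient_id="example-patient-001",
    variant="chr1:g.123456A>G (placeholder)",
    reference_genome="GRCh38",
    pipeline=[PipelineStep("aligner", "1.0"), PipelineStep("variant-caller", "2.3")],
)
print(json.dumps(asdict(entry), indent=2))  # structured record ready to attach to an EHR entry
```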