A patient safety knowledge graph supporting vaccine product development.

BMC Med Inform Decis Mak

AstraZeneca PLC, Gaithersburg, Maryland, USA.

Published: January 2024

Background: Knowledge graphs are well-suited for modeling complex, unstructured, and multi-source data and facilitating their analysis. During the COVID-19 pandemic, adverse event data were integrated into a knowledge graph to support vaccine safety surveillance and nimbly respond to urgent health authority questions. Here, we provide details of this post-marketing safety system using public data sources. In addition to challenges with varied data representations, adverse event reporting on the COVID-19 vaccines generated an unprecedented volume of data, an order of magnitude larger than the adverse event data for all previous vaccines. The Patient Safety Knowledge Graph (PSKG) is a robust data store that accommodates this volume of adverse event data and harmonizes the primary surveillance data sources.

Methods: We designed a semantic model to represent key safety concepts. We built an extract-transform-load (ETL) data pipeline to parse and import primary public data sources, align key elements such as vaccine names, integrate the Medical Dictionary for Regulatory Activities (MedDRA), and apply quality metrics. PSKG is deployed in a Neo4j graph database and made available via a web interface and application programming interfaces (APIs).
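The alignment step described in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the alias table, the node labels (`Report`, `Vaccine`, `Event`), the relationship names, and the field names `report_id`, `vaccine`, and `event_term` are all hypothetical placeholders, and the graph is held in plain Python sets rather than loaded into Neo4j.

```python
import re
from collections import defaultdict

# Hypothetical alias table: raw source spellings -> one canonical vaccine name.
# These strings are placeholders, not values taken from PSKG.
VACCINE_ALIASES = {
    "vaccine x": "VACCINE-X",
    "vaccine-x (manufacturer a)": "VACCINE-X",
    "vacc. x": "VACCINE-X",
}


def normalize_vaccine(raw_name):
    """Map a raw vaccine string to its canonical name, or None if unknown."""
    key = re.sub(r"\s+", " ", raw_name.strip().lower())
    return VACCINE_ALIASES.get(key)


def build_graph(reports):
    """Turn raw report rows into node sets and (source, relation, target) edges."""
    nodes = defaultdict(set)  # node label -> set of node keys
    edges = set()             # (report_id, relationship, target_key)
    unmatched = []            # report ids set aside as a simple quality metric
    for row in reports:
        vaccine = normalize_vaccine(row["vaccine"])
        if vaccine is None:
            unmatched.append(row["report_id"])
            continue
        nodes["Report"].add(row["report_id"])
        nodes["Vaccine"].add(vaccine)
        nodes["Event"].add(row["event_term"])
        edges.add((row["report_id"], "ADMINISTERED", vaccine))
        edges.add((row["report_id"], "REPORTED", row["event_term"]))
    return nodes, edges, unmatched


# Illustrative rows from two imaginary sources that spell the vaccine differently.
rows = [
    {"report_id": "A-1", "vaccine": "Vaccine X", "event_term": "Headache"},
    {"report_id": "B-7", "vaccine": " VACC.  X ", "event_term": "Headache"},
    {"report_id": "B-8", "vaccine": "unknown product", "event_term": "Pyrexia"},
]
nodes, edges, unmatched = build_graph(rows)
```

Because both spellings resolve to a single `Vaccine` node, downstream analyses never need to repeat the name matching; rows that cannot be aligned are counted rather than silently dropped.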

Results: We import and align adverse event data and vaccine exposure data from 250 countries on a weekly basis, producing a graph with 4,340,980 nodes and 30,544,475 edges as of July 1, 2022. PSKG is used for ad-hoc analyses and periodic reporting for several widely available COVID-19 vaccines. Analysis code using the knowledge graph is 80% shorter than an equivalent implementation written entirely in Python, and runs over 200 times faster.

Conclusions: Organizing safety data into a concise model of nodes, properties, and edge relationships has greatly simplified analysis code by removing complex parsing and transformation algorithms from individual analyses and instead managing these centrally. The adoption of the knowledge graph transformed how the team answers key scientific and medical questions. Whereas previously an analysis would involve aggregating and transforming primary datasets from scratch to answer a specific question, the team can now iterate easily and respond as quickly as requests evolve (e.g., "Produce vaccine-X safety profile for adverse event-Y by country instead of age-range").
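The kind of iteration described in the Conclusions can be sketched as follows. This is an illustration under stated assumptions, not the PSKG API: the records, the field names (`vaccine`, `event`, `country`, `age_range`), and the function `safety_profile` are hypothetical, and the data stand in for rows that a graph query might return after central harmonization.

```python
from collections import Counter

# Hypothetical, already-harmonized records; placeholder values only.
REPORTS = [
    {"vaccine": "VACCINE-X", "event": "EVENT-Y", "country": "SE", "age_range": "18-49"},
    {"vaccine": "VACCINE-X", "event": "EVENT-Y", "country": "SE", "age_range": "50-64"},
    {"vaccine": "VACCINE-X", "event": "EVENT-Y", "country": "US", "age_range": "50-64"},
    {"vaccine": "VACCINE-X", "event": "EVENT-Z", "country": "US", "age_range": "18-49"},
    {"vaccine": "VACCINE-W", "event": "EVENT-Y", "country": "US", "age_range": "18-49"},
]


def safety_profile(reports, vaccine, event, group_by):
    """Count reports of one event for one vaccine, grouped by a chosen field."""
    counts = Counter(
        r[group_by]
        for r in reports
        if r["vaccine"] == vaccine and r["event"] == event
    )
    return dict(counts)


# The original request, then the revised one: only the grouping key changes.
by_age = safety_profile(REPORTS, "VACCINE-X", "EVENT-Y", group_by="age_range")
by_country = safety_profile(REPORTS, "VACCINE-X", "EVENT-Y", group_by="country")
```

Since parsing and alignment have already happened upstream, switching from age range to country is a one-argument change rather than a new aggregation written from scratch.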

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10768450
DOI: http://dx.doi.org/10.1186/s12911-023-02409-8

Publication Analysis

Top Keywords

knowledge graph: 20
adverse event: 16
data: 13
event data: 12
patient safety: 8
safety knowledge: 8
public data: 8
data sources: 8
reporting covid-19: 8
covid-19 vaccines: 8

Similar Publications

Adolescence is a period in which peer problems and emotional symptoms markedly increase in prevalence. However, the causal mechanisms regarding how peer problems cause emotional symptoms at a behavioral level and vice versa remain unknown. To address this gap, the present study investigated the longitudinal network of peer problems and emotional symptoms among Australian adolescents aged 12-14 years.

Diversity of complementary diet and early food allergy risk.

Pediatr Allergy Immunol

January 2025

Department of Clinical Sciences, Pediatrics, Umeå University, Umeå, Sweden.

Introduction: Diet diversity (DD) in infancy may be protective for early food allergy (FA) but there is limited knowledge about how DD incorporating consumption frequency influences FA risk.

Methods: Three measures of DD were investigated in 2060 infants at 6 and/or at 9 months of age within the NorthPop Birth Cohort Study: a weighted DD score based on intake frequency, the number of introduced foods, and the number of introduced allergenic foods. In multivariable logistic regression models based on directed acyclic graphs, associations to parentally reported physician-diagnosed FA at age 9 and 18 months were estimated, including sensitivity and stratified analyses.

Predicting phage-host interaction via hyperbolic Poincaré graph embedding and large-scale protein language technique.

iScience

January 2025

Key Laboratory of Resources Biology and Biotechnology in Western China, Ministry of Education, Provincial Key Laboratory of Biotechnology of Shaanxi Province, the College of Life Sciences, Northwest University, Xi'an 710069, China.

Bacteriophages (phages) are increasingly viewed as a promising alternative for the treatment of antibiotic-resistant bacterial infections. However, the diversity of host ranges complicates the identification of target phages. Existing computational tools often fail to accurately identify phages across different bacterial species.

Background: Drug-drug interactions (DDIs) especially antagonistic ones present significant risks to patient safety, underscoring the urgent need for reliable prediction methods. Recently, substructure-based DDI prediction has garnered much attention due to the dominant influence of functional groups and substructures on drug properties. However, existing approaches face challenges regarding the insufficient interpretability of identified substructures and the isolation of chemical substructures.

ARCH: Large-scale knowledge graph via aggregated narrative codified health records analysis.

J Biomed Inform

January 2025

Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, 02115, MA, USA; VA Boston Healthcare System, 150 S Huntington Ave, Boston, 02130, MA, USA.

Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes (NLP). The complexity of EHR presents challenges in feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features.
