AI Article Synopsis

  • The paper investigates how to track the onset and progression of diseases over time, using UK Biobank data and phenotyping algorithms to identify time-varying risk factors.
  • Researchers developed a method to consolidate health data, specifically targeting diabetes complications like cardiovascular disease, kidney disease, and retinopathy, while ensuring relevant definitions and expert validation in the phenotyping process.
  • The study identified 49 049 participants with diabetes and reported risk prediction performance for each complication that is consistent with standard cohort-based risk models, emphasizing that event curation, risk factor identification, and cohort definition need to be considered together.

Article Abstract

Objective: Modern healthcare data reflect massive multi-level and multi-scale information collected over many years. The majority of the existing phenotyping algorithms use case-control definitions of disease. This paper aims to study the time to disease onset and progression and identify the time-varying risk factors that drive them.

Materials And Methods: We developed an algorithmic approach to phenotyping the incidence of diseases by consolidating data sources from the UK Biobank (UKB), including primary care electronic health records (EHRs). The approach defines events, event dates, and their censoring times; includes relevant terms and existing phenotypes; excludes generic, rare, or semantically distant terms; forward-maps terminology terms; and incorporates expert review. We applied our approach to phenotyping diabetes complications, including a composite cardiovascular disease (CVD) outcome, diabetic kidney disease (DKD), and diabetic retinopathy (DR), in the UKB study.
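To make the idea of events, event dates, and censoring times concrete, the following is a minimal Python sketch, not the authors' pipeline. The function name, the record layout (date, code), the code set, and the rule that a qualifying code dated on or before baseline marks a prevalent case are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class TimeToEvent:
    event: bool      # True if a qualifying event was observed before censoring
    end_date: date   # event date if observed, otherwise the censoring date
    days: int        # follow-up time in days from baseline to end_date


def derive_time_to_event(
    baseline: date,
    records: Iterable[Tuple[date, str]],
    included_codes: Set[str],
    censor_date: date,
) -> Optional[TimeToEvent]:
    """Hypothetical helper: first qualifying event after baseline, else censored.

    Returns None for a prevalent case (a qualifying code dated on or before
    baseline); that exclusion rule is an assumption of this sketch.
    """
    if censor_date < baseline:
        raise ValueError("censor_date must not precede baseline")

    # Keep only records whose code is in the curated inclusion list.
    dates = sorted(d for d, code in records if code in included_codes)

    if dates and dates[0] <= baseline:
        return None  # prevalent case, not an incident event

    if dates and dates[0] <= censor_date:
        first = dates[0]
        return TimeToEvent(True, first, (first - baseline).days)

    # No qualifying event within follow-up: the participant is censored.
    return TimeToEvent(False, censor_date, (censor_date - baseline).days)
```

The returned triple (event flag, end date, follow-up days) is the usual input to a time-to-event model; a real pipeline would also need to handle competing sources of censoring such as death or loss to follow-up.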

Results: We identified 49 049 participants with diabetes. Among them, 1023 had type 1 diabetes (T1D), and 40 193 had type 2 diabetes (T2D). A total of 23 833 diabetes subjects had linked primary care records. There were 3237, 3113, and 4922 patients with CVD, DKD, and DR events, respectively. The risk prediction performance for each outcome was assessed, and the results are consistent with the area under the receiver operating characteristic (ROC) curve (AUC) of standard risk prediction models developed using cohort studies.
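For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case. The sketch below computes it from that pairwise definition in plain Python. The abstract does not state how the authors computed AUC, so this is only an illustration of the quantity; the function name and the toy data are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for participants with the event, 0 otherwise.
    scores: predicted risk, higher meaning more likely to have the event.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data (not from the study): 3 of the 4 case/non-case pairs are ordered correctly.
print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75
```

The double loop is quadratic in cohort size, which is fine for illustration; a rank-based implementation would be preferable for tens of thousands of participants.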

Discussion And Conclusion: Our publicly available pipeline and platform enable streamlined curation of incidence events, identification of time-varying risk factors underlying disease progression, and the definition of a relevant cohort for time-to-event analyses. These important steps need to be considered simultaneously to study disease progression.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9912368
DOI: http://dx.doi.org/10.1093/jamiaopen/ooad006

Publication Analysis

Top Keywords

disease progression: 12
risk factors: 12
diabetes complications: 8
time-varying risk: 8
approach phenotyping: 8
primary care: 8
type diabetes: 8
risk prediction: 8
disease: 7
diabetes: 6
