Background: Type 2 diabetes mellitus (T2DM) is a significant and steadily growing global public health concern. This study aimed to predict the incidence of T2DM within 5 years and the risk of mortality following the onset of T2DM, using data from three independent cohorts in different countries.

Methods: We used data from three independent, large-scale, general population-based cohort studies from three countries. The Korean cohort (NHIS-NSC cohort; discovery cohort; n = 973,303), conducted between 1 January 2002 and 31 December 2013, was used for training and internal validation, whereas the Japanese cohort (JMDC cohort; validation cohort A; n = 12,143,715) and the UK cohort (UK Biobank; validation cohort B; n = 416,656) were used for external validation. We trained several machine learning (ML)-based models on 18 features to predict the incidence of T2DM within 5 years of a regular health checkup, and calculated Shapley Additive Explanation (SHAP) values to interpret the predictions. To assess the robustness of the ML-based prediction model, we divided the model probability into tertiles and investigated its association with the risk of mortality following the onset of T2DM.
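As a minimal sketch of the tertile step described in the Methods, the snippet below splits a set of predicted probabilities into three equal-sized groups labeled T1 to T3. The abstract does not report the cut points or how they were computed, so the use of sample quantiles, the function name `assign_tertiles`, and the example values are all assumptions made for illustration.

```python
from statistics import quantiles


def assign_tertiles(probabilities):
    """Label each probability T1, T2, or T3 using the sample's own tertile cut points.

    Assumption: cut points come from the sample quantiles; the study's actual
    cut points are not given in the abstract.
    """
    lower_cut, upper_cut = quantiles(probabilities, n=3)
    labels = []
    for p in probabilities:
        if p <= lower_cut:
            labels.append("T1")
        elif p <= upper_cut:
            labels.append("T2")
        else:
            labels.append("T3")
    return labels


# Hypothetical model probabilities, not data from the study.
example_probabilities = [0.05, 0.10, 0.15, 0.30, 0.35, 0.40, 0.70, 0.80, 0.90]
tertile_labels = assign_tertiles(example_probabilities)
```

Each group can then enter a survival model as a categorical exposure, which is one plausible way to obtain per-tertile hazard ratios of the kind reported in the Findings.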

Findings: In the discovery cohort, the ensemble model using voting with logistic regression and adaptive boosting achieved a balanced accuracy of 72.6% and an area under the receiver operating characteristic curve (AUROC) of 0.792. The SHAP value analysis of our proposed model revealed that age was the most important predictor of incident T2DM, followed by fasting blood glucose, hemoglobin, γ-glutamyl transferase level, and body mass index. The model probability was associated with an increased risk of mortality (T1: adjusted hazard ratio, 2.82 [95% CI, 2.01-3.94]; T2: 3.89 [2.74-5.53]; and T3: 7.73 [5.37-11.12]). Similar patterns and trends were observed in the validation cohorts (T1: 1.74 [1.49-2.03], T2: 1.97 [1.69-2.30], and T3: 3.31 [2.82-3.38] in validation cohort A; T1: 1.33 [1.03-1.71], T2: 1.54 [1.21-1.96], and T3: 1.73 [1.36-2.20] in validation cohort B).
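For readers unfamiliar with the two performance metrics reported above, the following sketch computes them from their standard definitions: balanced accuracy as the mean of sensitivity and specificity, and AUROC as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case. The labels, scores, and the 0.5 decision threshold are hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = incident case)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (true_pos / n_pos + true_neg / n_neg)


def auroc(y_true, scores):
    """Fraction of (case, non-case) pairs ranked correctly; ties count as half."""
    pos_scores = [s for t, s in zip(y_true, scores) if t == 1]
    neg_scores = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical labels and model probabilities for six individuals.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

ba = balanced_accuracy(y_true, y_pred)  # sensitivity 0.5, specificity 0.75
auc = auroc(y_true, scores)             # 7 of 8 pairs ranked correctly
```

Balanced accuracy is often preferred over plain accuracy when incident cases are much rarer than non-cases, since a model that predicts "no T2DM" for everyone would still score only 50%.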

Interpretation: This study derived and validated an ML-based model to predict the incidence of T2DM within 5 years across three countries (South Korea, Japan, and the UK), showing that the model probability is associated with an increased risk of mortality.

Funding: Institute of Information & Communications Technology Planning & Evaluation, South Korea.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11787438
DOI: http://dx.doi.org/10.1016/j.eclinm.2025.103069


