Machine Learning Methods for the Diagnosis of Chronic Obstructive Pulmonary Disease in Healthy Subjects: Retrospective Observational Cohort Study.

JMIR Med Inform

Division of Environmental Medicine and Population Sciences, Department of Social and Environmental Medicine, Graduate School of Medicine, Osaka University, Osaka, Japan.

Published: July 2021

Background: Airflow limitation is a critical physiological feature in chronic obstructive pulmonary disease (COPD), for which long-term exposure to noxious substances, including tobacco smoke, is an established risk factor. However, not all long-term smokers develop COPD, meaning that other risk factors exist.

Objective: This study aimed to predict the risk factors for COPD diagnosis using machine learning in an annual medical check-up database.

Methods: In this retrospective observational cohort study (ARTDECO [Analysis of Risk Factors to Detect COPD]), annual medical check-up records for all Hitachi Ltd employees in Japan collected from April 1998 to March 2019 were analyzed. Employees who provided informed consent via an opt-out model were screened and those aged 30 to 75 years without a prior diagnosis of COPD/asthma or a history of cancer were included. The database included clinical measurements (eg, pulmonary function tests) and questionnaire responses. To predict the risk factors for COPD diagnosis within a 3-year period, the Gradient Boosting Decision Tree machine learning (XGBoost) method was applied as a primary approach, with logistic regression as a secondary method. A diagnosis of COPD was made when the ratio of the prebronchodilator forced expiratory volume in 1 second (FEV1) to prebronchodilator forced vital capacity (FVC) was <0.7 during two consecutive examinations.
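The diagnostic rule stated in the Methods can be illustrated with a minimal sketch. It assumes each examination is recorded as a pair of prebronchodilator values (forced expiratory volume in 1 second, FVC) in chronological order. The function name `meets_copd_criterion` and the sample values are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple

RATIO_THRESHOLD = 0.7  # cut-off stated in the Methods


def meets_copd_criterion(exams: Sequence[Tuple[float, float]]) -> bool:
    """Return True if the FEV1/FVC ratio is < 0.7 at two consecutive examinations.

    `exams` is a chronologically ordered sequence of (fev1, fvc) pairs,
    both in litres (hypothetical input layout, not the study's schema).
    """
    below = [fev1 / fvc < RATIO_THRESHOLD for fev1, fvc in exams]
    # Pair each examination with the next one and look for two in a row.
    return any(a and b for a, b in zip(below, below[1:]))


# Hypothetical example: ratios 0.75, 0.65, 0.675 -> the last two are both below 0.7.
print(meets_copd_criterion([(3.0, 4.0), (2.6, 4.0), (2.7, 4.0)]))
```

A single low ratio, or two low ratios separated by a normal examination, does not satisfy the rule in this sketch.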

Results: Of the 26,101 individuals screened, 1213 met the exclusion criteria, and thus, 24,815 individuals were included in the analysis. The top 10 predictors for COPD diagnosis were FEV1/FVC, smoking status, allergic symptoms, cough, pack years, hemoglobin A1c, serum albumin, mean corpuscular volume, percent predicted vital capacity, and percent predicted value of FEV1. The areas under the receiver operating characteristic curves of the XGBoost model and the logistic regression model were 0.956 and 0.943, respectively.
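For readers unfamiliar with the metric reported above, the area under the receiver operating characteristic curve equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. It is an illustration only: the function name `roc_auc` and the toy labels and scores are assumptions, not the study's code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores.

    Each (positive, negative) pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise; the AUC is the mean over all pairs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of cases, so for a cohort of this size a rank-based implementation from an established library would be the practical choice.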

Conclusions: Using a machine learning model in this longitudinal database, we identified several parameters, other than smoking exposure and lung function, as risk factors that may help general practitioners and occupational health physicians predict the development of COPD. Further research to confirm our results is warranted, as our analysis involved a database used only in Japan.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8293159
DOI: http://dx.doi.org/10.2196/24796

Publication Analysis

Top Keywords

risk factors: 16
machine learning: 12
chronic obstructive: 8
obstructive pulmonary: 8
pulmonary disease: 8
retrospective observational: 8
observational cohort: 8
cohort study: 8
predict risk: 8
factors copd: 8

Similar Publications

Enteropathogens are major contributors to mortality and morbidity, particularly in settings with limited access to water, sanitation, and hygiene infrastructure. To assess transmission pathways associated with enteropathogen infection, we measured household environmental conditions and assayed 22 enteropathogens using TaqMan Array Cards in stool samples from 276 six-month-old children living in communities along a rural-urban gradient in Northern Ecuador. We utilized multivariable models, risk factor importance, and distance-based statistical methods to test factors associated with infection.

Objectives This study analyzed the practices and findings on postpartum type 2 diabetes mellitus (T2DM) screening among pregnant women with gestational diabetes mellitus (GDM). Methods A retrospective cohort study was conducted at a tertiary care center in Western Saudi Arabia, between January 1, 2016, and December 31, 2018. It involved 642 nondiabetic women with a confirmed diagnosis of GDM, who were followed until delivery.

Chest pain in a multi-ethnic population: A community-based study on sex differences in chest pain prevalence and care contacts.

Int J Cardiol Cardiovasc Risk Prev

March 2025

Department of Public and Occupational Health, Amsterdam UMC, Location AMC, Amsterdam, the Netherlands.

Objective: While chest pain is a common symptom, its prevalence among women and men across ethnic groups is unknown. Moreover, how chest pain is associated with general practitioner (GP) and cardiologist visits in women and men across ethnic groups, remains to be determined.

Design: We used baseline data on 12423 women and 9071 men from the multi-ethnic HELIUS cohort (Amsterdam, the Netherlands; 2011-2015).

Background And Aims: Atherosclerotic cardiovascular disease (ASCVD) is a leading cause of mortality, and while the association between the urinary albumin-to-creatinine ratio (UACR) and cardiovascular risk is recognized, the specific impact of UACR on the long-term survival of ASCVD patients remains not fully understood. The aim of this study is to investigate the influence of UACR on the long-term risk of all-cause mortality in patients with ASCVD.

Methods: This study included ASCVD patients from the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2018.
