Identification of medical conditions using claims data is generally conducted with algorithms based on subject-matter knowledge. However, these claims-based algorithms (CBAs) depend heavily on the researchers' level of knowledge and are not necessarily optimized for target conditions. We investigated whether machine learning methods can supplement researchers' knowledge of target conditions in building CBAs. We conducted a retrospective cohort study using a claims database combined with annual health check-up results of employees' health insurance programs for fiscal year 2016-17 in Japan (study population for hypertension, N = 631,289; diabetes, N = 152,368; dyslipidemia, N = 614,434). We constructed CBAs with logistic regression, k-nearest neighbor, support vector machine, penalized logistic regression, tree-based model, and neural network for identifying patients with three common chronic conditions: hypertension, diabetes, and dyslipidemia. We then compared their association measures using a completely held-out test set (25% of the study population). Among the test cohorts of 157,822, 38,092, and 153,608 enrollees for hypertension, diabetes, and dyslipidemia, 25.4%, 8.4%, and 38.7%, respectively, had a diagnosis of the corresponding condition. The areas under the receiver operating characteristic curve (AUCs) of the logistic regression with/without subject-matter knowledge about the target condition were .923/.921 for hypertension, .957/.938 for diabetes, and .739/.747 for dyslipidemia. The logistic lasso, logistic elastic-net, and tree-based methods yielded AUCs comparable to those of the logistic regression with subject-matter knowledge: .923-.931 for hypertension; .958-.966 for diabetes; .747-.773 for dyslipidemia. We found that machine learning methods can attain AUCs comparable to the conventional knowledge-based method in building CBAs.
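The evaluation described above rests on two mechanical steps: setting aside 25% of each study population as a test set, and scoring each model by its AUC on that set. The following is a minimal sketch of both steps in plain Python, not the authors' implementation. The function names `holdout_split` and `auc` are hypothetical, the random shuffle is an assumption (the abstract does not say how the split was drawn), and the AUC is computed with the standard rank-based (Mann-Whitney) formula.

```python
import random


def holdout_split(ids, test_fraction=0.25, seed=0):
    """Shuffle ids and set aside a test set that is never used for fitting.

    Hypothetical helper: the abstract only states that 25% was held out.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)  # rounds down
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def auc(labels, scores):
    """AUC: probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as 0.5. Uses average
    ranks, which is equivalent to the Mann-Whitney U statistic.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs both positive and negative labels")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Study population sizes as reported in the abstract.
    for name, n in [("hypertension", 631289),
                    ("diabetes", 152368),
                    ("dyslipidemia", 614434)]:
        print(name, int(n * 0.25))

    # Toy labels and scores, purely illustrative (not study data).
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Rounding 25% of each reported study population down gives 157,822, 38,092, and 153,608, which matches the test cohort sizes stated in the abstract.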


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8476042
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0254394


