AI Article Synopsis

  • The study assessed a machine learning approach for estimating low-density lipoprotein cholesterol (LDL-C) and how different training datasets affected its accuracy.
  • Three distinct training datasets were used: health check-up participants, clinical patients from Gifu University Hospital, and clinical patients from Fujita Health University Hospital. Nine machine learning models were developed and evaluated for performance.
  • Results indicated that models trained on clinical patient data performed better than those trained on health check-up data, emphasizing the need for matching dataset characteristics when training machine learning models for accurate LDL-C estimation.

Article Abstract

Objectives: We evaluated the applicability of a machine learning-based low-density lipoprotein-cholesterol (LDL-C) estimation method and the influence of the characteristics of the training datasets.

Methods: Three training datasets were used: health check-up participants at the Resource Center for Health Science (n = 2664), clinical patients at Gifu University Hospital (n = 7409), and clinical patients at Fujita Health University Hospital (n = 14,842). Nine different machine learning models were constructed through hyperparameter tuning and 10-fold cross-validation. A separate test dataset of 3711 clinical patients at Fujita Health University Hospital was used to compare and validate the models against the Friedewald formula and the Martin method.
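For orientation, the two comparator methods named above can be sketched in a few lines. This is a minimal illustration, not the study's code. The Friedewald formula estimates LDL-C as total cholesterol minus HDL-C minus triglycerides divided by 5 (all in mg/dL) and is generally not applied when triglycerides reach 400 mg/dL or more. The Martin method has the same form but replaces the fixed divisor with an adjustable factor looked up from a published table. That table is not reproduced here, so the `factor` argument and both function names are assumptions made for this sketch.

```python
def friedewald_ldl(total_chol, hdl_chol, triglycerides):
    """Estimate LDL-C in mg/dL as TC - HDL-C - TG / 5 (Friedewald formula)."""
    if triglycerides >= 400:
        # The formula is generally considered unreliable at high triglycerides.
        raise ValueError("Friedewald formula is not recommended for TG >= 400 mg/dL")
    return total_chol - hdl_chol - triglycerides / 5.0


def ratio_based_ldl(total_chol, hdl_chol, triglycerides, factor):
    """Same form as above with a caller-supplied divisor (hypothetical helper).

    The Martin method takes this divisor from a lookup table indexed by
    non-HDL-C and triglyceride strata; the table itself is not included here.
    """
    return total_chol - hdl_chol - triglycerides / factor


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(friedewald_ldl(200, 50, 150))        # 120.0
    print(ratio_based_ldl(200, 50, 150, 6.0))  # 125.0
```

With a divisor of 5, `ratio_based_ldl` reduces to the Friedewald estimate, which shows that the two comparators differ only in how the triglyceride term is scaled.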

Results: The models trained on the health check-up dataset produced coefficients of determination that were equal to or inferior to those of the Martin method. In contrast, the coefficients of determination of several models trained on clinical patients exceeded those of the Martin method. The means of the differences and the convergences to the direct method were higher for the models trained on the clinical patients' dataset than for those trained on the health check-up participants' dataset. The models trained on the health check-up dataset tended to overestimate the LDL-cholesterol classification defined by the 2019 ESC/EAS Guideline.
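The two summary statistics mentioned above can be computed as in the sketch below, which treats the direct-method measurement as the reference. It assumes the common definition of the coefficient of determination, 1 - SS_res / SS_tot. The abstract does not say which definition the authors used, and the function names and sample values are hypothetical.

```python
def r_squared(reference, estimate):
    """Coefficient of determination of the estimates against the reference."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def mean_difference(reference, estimate):
    """Mean of (estimate - reference); positive values indicate overestimation."""
    return sum(e - r for r, e in zip(reference, estimate)) / len(reference)


if __name__ == "__main__":
    # Illustrative LDL-C values in mg/dL, not data from the study.
    direct = [100, 120, 140, 160]
    estimated = [102, 118, 141, 163]
    print(round(r_squared(direct, estimated), 3))  # 0.991
    print(mean_difference(direct, estimated))      # 1.0
```

A positive mean difference alongside a high coefficient of determination would describe a model that tracks the direct method closely but reads slightly high on average.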

Conclusion: Although machine learning models provide a valuable method for LDL-C estimation, they should be trained on datasets with matched characteristics. The versatility of machine learning methods is another important consideration.

Source
http://dx.doi.org/10.1177/00045632231180408

Publication Analysis

Top Keywords

machine learning: 16
clinical patients: 16
models trained: 16
health check-up: 12
university hospital: 12
coefficients determination: 12
low-density lipoprotein-cholesterol: 8
training datasets: 8
patients fujita: 8
fujita health: 8

Similar Publications

Automated Classification of Cardiac Arrhythmia using Short-Duration ECG Signals and Machine Learning.

Biomed Phys Eng Express

January 2025

Electronics and Communication Engineering, Rajiv Gandhi University, Rono Hills, Doimukh, ITANAGAR, Itanagar, Arunachal Pradesh, 791112, INDIA.

Accurate detection of cardiac arrhythmias is crucial for preventing premature deaths. The current study employs a dual-stage Discrete Wavelet Transform (DWT) and a median filter to eliminate noise from ECG signals. Subsequently, ECG signals are segmented, and QRS regions are extracted for further preprocessing.

Eutrophication is one of the most relevant concerns due to the risk to water supply and food security. Nitrogen and phosphorus chemical species concentrations determined the risk and magnitude of eutrophication. These analyses are even more relevant in basins with intensive agriculture due to agrochemical discharges.

Automated and Efficient Sampling of Chemical Reaction Space.

Adv Sci (Weinh)

January 2025

Department of Chemistry, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.

Machine learning interatomic potentials (MLIPs) promise quantum-level accuracy at classical force field speeds, but their performance hinges on the quality and diversity of training data. An efficient and fully automated approach to sample chemical reaction space without relying on human intuition, addressing a critical gap in MLIP development is presented. The method combines the speed of tight-binding calculations with selective high-level refinement, generating diverse datasets that capture both equilibrium and reactive regions of potential energy surfaces.

The aim of this study is to address the limitations of convolutional networks in recognizing modulation patterns. These networks are unable to utilize temporal information effectively for feature extraction and modulation pattern recognition, resulting in inefficient modulation pattern recognition. To address this issue, a signal modulation recognition method based on a two-way interactive temporal attention network algorithm has been developed.

Soil spectroscopy is a widely used method for estimating soil properties that are important to environmental and agricultural monitoring. However, a bottleneck to its more widespread adoption is the need for establishing large reference datasets for training machine learning (ML) models, which are called soil spectral libraries (SSLs). Similarly, the prediction capacity of new samples is also subject to the number and diversity of soil types and conditions represented in the SSLs.
