Although machine learning has been used successfully in a wide range of healthcare applications, several factors can influence the performance of a machine learning system. One major issue for a machine learning algorithm is an imbalanced dataset. A dataset is imbalanced when the distribution of its data is not uniform, which makes it harder to build accurate models. In this paper, intelligent models are implemented to predict the hematocrit level of blood from visible spectral data. The aim of this work is to show the effects of two balancing techniques (SMOTE and SMOTE+ENN) on the imbalanced dataset of blood spectra. Four different machine learning systems are fitted with the imbalanced and the balanced datasets, and their performances are compared, showing an improvement in accuracy when balancing is used.
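The abstract does not show how the two balancing techniques were implemented, so the following is only a minimal sketch of the general ideas behind them, written in plain Python. It assumes the target is treated as discrete class labels, and it uses made-up two-dimensional toy points rather than blood spectra. The helper names (`knn_indices`, `smote`, `enn`) and all parameter values are hypothetical. SMOTE creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours; ENN (edited nearest neighbours) then removes samples whose label disagrees with the majority of their nearest neighbours.

```python
import math
import random
from collections import Counter


def knn_indices(i, X, candidates, k):
    """Indices of the k candidates nearest to X[i] (Euclidean), excluding i itself."""
    others = [j for j in candidates if j != i]
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def smote(X, y, minority, n_new, k=5, rng=None):
    """Append n_new synthetic minority samples (needs at least 2 minority samples)."""
    rng = rng or random.Random(0)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        i = rng.choice(minority_idx)                       # random minority sample
        j = rng.choice(knn_indices(i, X, minority_idx, k))  # one of its minority neighbours
        u = rng.random()                                   # interpolation factor in [0, 1)
        X_out.append([a + u * (b - a) for a, b in zip(X[i], X[j])])
        y_out.append(minority)
    return X_out, y_out


def enn(X, y, k=3):
    """Keep only samples whose label matches the majority label of their k nearest neighbours."""
    all_idx = range(len(X))
    keep = []
    for i in all_idx:
        votes = Counter(y[j] for j in knn_indices(i, X, all_idx, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy imbalanced data (hypothetical): 40 majority points, 8 minority points.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
X += [[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(8)]
y = [0] * 40 + [1] * 8

X_sm, y_sm = smote(X, y, minority=1, n_new=32, rng=rng)  # SMOTE only
X_se, y_se = enn(X_sm, y_sm)                             # SMOTE followed by ENN

print("original:  ", Counter(y))
print("SMOTE:     ", Counter(y_sm))
print("SMOTE+ENN: ", Counter(y_se))
```

In practice, the imbalanced-learn library provides `SMOTE` and `SMOTEENN` classes that would normally be used instead of hand-written helpers like these. Whether the authors used that library is not stated in the abstract.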


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8024026
DOI: http://dx.doi.org/10.1049/htl2.12006
