Conversion of adverse data corpus to shrewd output using sampling metrics.

Shahzad Ashraf, Sehrish Saleem, Tauqeer Ahmed, Zeeshan Aslam, Durr Muhammad

Vis Comput Ind Biomed Art

Pakistan Steel Mills Karachi, Karachi, 75200, Pakistan.

Published: August 2020

In an imbalanced dataset, at least one class is typically outnumbered by the others. A machine learning algorithm (classifier) trained on such a dataset predicts the majority (frequently occurring) class more often than the minority (rarely occurring) classes. Training with an imbalanced dataset poses challenges for classifiers; however, applying suitable techniques to reduce class imbalance can improve their performance. In this study, we consider an imbalanced dataset from an educational context. First, we examine the shortcomings of classifying an imbalanced dataset. Then, we apply data-level algorithms for class balancing and compare the performance of classifiers. Performance is measured using metrics derived from the classifiers' confusion matrices: accuracy, precision, recall, and F-measure. The results show that classification with an imbalanced dataset may produce high accuracy but low precision and recall for the minority class. The analysis confirms that both undersampling and oversampling are effective for balancing datasets, but oversampling performs better.
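The gap between accuracy and minority-class recall described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the "pass"/"fail" labels, the 95/5 class split, the hypothetical predictions, and the helper names `minority_metrics`, `random_oversample`, and `random_undersample`. The abstract does not name the specific data-level algorithms the authors used; plain random resampling stands in here as the simplest example of the technique.

```python
import random
from collections import Counter


def minority_metrics(y_true, y_pred, positive):
    """Accuracy, plus precision, recall and F-measure for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples until every class matches the smallest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_samples, out_labels = [], []
    for cls in counts:
        pool = [s for s, l in zip(samples, labels) if l == cls]
        kept = rng.sample(pool, target)
        out_samples.extend(kept)
        out_labels.extend([cls] * target)
    return out_samples, out_labels


# Hypothetical toy data: 95 majority ("pass") and 5 minority ("fail") records.
y_true = ["pass"] * 95 + ["fail"] * 5
# Hypothetical majority-biased predictions: one false alarm, one of five fails caught.
y_pred = ["pass"] * 94 + ["fail"] + ["fail"] + ["pass"] * 4

metrics = minority_metrics(y_true, y_pred, positive="fail")
print(metrics)  # accuracy is 0.95 while recall for "fail" is only 0.2

features = list(range(100))  # placeholder feature values
_, over_labels = random_oversample(features, y_true)
_, under_labels = random_undersample(features, y_true)
print(Counter(over_labels), Counter(under_labels))
```

Oversampling keeps every original record and adds duplicates of the minority class, whereas undersampling discards majority records; on a dataset this small, undersampling leaves only ten records to train on, which is one plausible reason a study might find oversampling preferable.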

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7417470
DOI: http://dx.doi.org/10.1186/s42492-020-00055-9

Similar Publications

A graph neural network-based model with out-of-distribution robustness for enhancing antiretroviral therapy outcome prediction for HIV-1.

Comput Med Imaging Graph

January 2025

Sapienza University of Rome, Department of Computer Control and Management Engineering Antonio Ruberti, 00185, Rome, Italy.

Giulia Di Teodoro, Federico Siciliano, Valerio Guarrasi, Anne-Mieke Vandamme, Valeria Ghisetti

Predicting the outcome of antiretroviral therapies (ART) for HIV-1 is a pressing clinical challenge, especially when the ART includes drugs with limited effectiveness data. This scarcity of data can arise either from the introduction of a new drug to the market or from limited use in clinical settings, resulting in clinical datasets with highly unbalanced therapy representation. To tackle this issue, we introduce a novel joint fusion model, which combines features from a Fully Connected (FC) Neural Network and a Graph Neural Network (GNN) in a multi-modality fashion.

An Intelligent System for Classifying Patient Complaints Using Machine Learning and Natural Language Processing: Development and Validation Study.

J Med Internet Res

January 2025

Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center For Child Health, Hang Zhou, China.

Xiadong Li, Qiang Shu, Canhong Kong, Jinhu Wang, Gang Li

Background: Accurate classification of patient complaints is crucial for enhancing patient satisfaction management in health care settings. Traditional manual methods for categorizing complaints often lack efficiency and precision. Thus, there is a growing demand for advanced and automated approaches to streamline the classification process.

Unsupervised Learning for Machinery Adaptive Fault Detection Using Wide-Deep Convolutional Autoencoder with Kernelized Attention Mechanism.

Sensors (Basel)

December 2024

State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China.

Hao Yan, Xiangfeng Si, Jianqiang Liang, Jian Duan, Tielin Shi

Applying deep learning to unsupervised bearing fault diagnosis in complex industrial environments is challenging. Traditional fault detection methods rely on labeled data, which is costly and labor-intensive to obtain. This paper proposes a novel unsupervised approach, WDCAE-LKA, which combines a wide kernel convolutional autoencoder (WDCAE) with a large kernel attention (LKA) mechanism to improve fault detection under unlabeled conditions. An adaptive threshold module based on a multi-layer perceptron (MLP) dynamically adjusts thresholds, boosting model robustness in imbalanced scenarios.

Denoising Diffusion Implicit Model Combined with TransNet for Rolling Bearing Fault Diagnosis Under Imbalanced Data.

Sensors (Basel)

December 2024

School of Mechatronics and Vehicle Engineering, East China Jiaotong University, Nanchang 330013, China.

Chaobing Wang, Cong Huang, Long Zhang, Zhibin Xiang, Yiwen Xiao

Data imbalances present a serious problem for intelligent fault diagnosis. They can lead to reduced diagnostic precision, which can jeopardize equipment reliability and safety. To address this problem, this paper proposes a novel fault diagnosis method combining the denoising diffusion implicit model (DDIM) with a new convolutional neural network framework.

Predictive analytics technique based on hybrid sampling to manage unbalanced data in smart cities.

Heliyon

December 2024

Computer Science & Engineering Department, University Institute of Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya (Technological University of Madhya Pradesh), Bhopal, Madhya Pradesh, India.

Ayushi Chahal, Preeti Gulia, Nasib Singh Gill, Mohammad Yahya, Mohd Anul Haq

A city is deemed smart when it has the capability to make decisions on its own. Artificial intelligence needs a lot of data from the physical world to make correct decisions. IoT sensor devices collect data from the surroundings, which is then used for predictive analytics.
