Imbalanced data occur in most fields of endeavor. The problem has been identified as a major bottleneck in machine learning and data mining and is becoming a serious concern in food processing applications. Inappropriate analysis of agricultural and food processing data has been identified as limiting the robustness of predictive models built for agri-food applications. Because rare cases occur infrequently, classification rules that detect small groups are scarce, so samples belonging to small classes are largely misclassified. Most existing machine learning algorithms, including K-means, decision trees, and support vector machines (SVMs), are not optimal for handling imbalanced data. Consequently, models developed from the analysis of such data are prone to rejection and non-adoption in real industrial and commercial settings. This paper showcases the reality of the imbalanced data problem in agri-food applications and proposes state-of-the-art artificial intelligence approaches for handling it, including data resampling, one-class learning, ensemble methods, feature selection, and deep learning techniques. It further evaluates existing and newer metrics that are well suited to imbalanced data. Correctly analyzing imbalanced data in food processing research will improve the accuracy of results and model development, which will in turn enhance the acceptability and adoptability of innovations and inventions.
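To make the point about misclassified small classes concrete, the following is a minimal sketch of how plain accuracy can look strong on imbalanced data while imbalance-aware metrics do not. The 95/5 class split, the labels, and the always-majority classifier are hypothetical illustrations, and balanced accuracy and the Matthews correlation coefficient (MCC) are common choices rather than necessarily the metrics the paper evaluates.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of recall on the positive class and recall on the negative class."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical data: 95 acceptable samples (label 0) and 5 defective ones (label 1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("balanced accuracy:", balanced_accuracy(*counts))
print("MCC:", mcc(*counts))
```

In this hypothetical case the classifier never detects a defective sample, yet its accuracy is high because the majority class dominates the count; balanced accuracy and MCC expose the failure because they weigh performance on both classes.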
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11507408 | PMC |
| http://dx.doi.org/10.3390/foods13203300 | DOI Listing |
Front Oncol
January 2025
State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China.
Objective: Current clinical data on the re-administration of PARPi maintenance therapy in platinum-sensitive recurrent ovarian cancer (PSROC) are limited. This study aims to investigate the efficacy and associated factors of PARPi re-maintenance therapy in PSROC patients in China.
Methods: This study included 201 patients with PSROC who had previously received maintenance therapy and achieved complete or partial response after platinum-based chemotherapy upon recurrence.
Lung cancer is a leading cause of cancer-related mortality, with disparities in incidence and outcomes observed across different racial and sex groups. Understanding the genetic factors of these disparities is critical for developing targeted treatment therapies. This study aims to identify both patient-specific and cohort-specific biomarker genes that contribute to lung cancer health disparities among African American males (AAMs), European American males (EAMs), African American females (AAFs), and European American females (EAFs).
MethodsX
June 2025
Department of Networking & Communications, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Chennai, India.
Forecasting student performance with precision is paramount for creating tailor-made interventions capable of boosting learning effectiveness. Most traditional student performance prediction models have difficulty dealing with multi-dimensional academic data, which can cause sub-optimal classification and yield only simple, generalized insights. To address these challenges, this research proposes a new model, the Multi-dimensional Student Performance Prediction Model (MSPP), which draws on advanced data preprocessing and feature engineering techniques using deep learning.
Mar Pollut Bull
January 2025
JK Laxmipat University, Jaipur, Rajasthan, India.
Marine pollution from oil spills poses major risks to coastal areas and aquatic life, leading to serious environmental health concerns. Oil spill detection using SAR data has moved from traditional segmentation to a variety of machine learning and deep learning models, with architectures such as UNET proving efficient for the task. This research paper proposes a GSCAT-UNET model for efficient oil spill detection and discrimination from lookalikes.
Neural Netw
January 2025
School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430070, Hubei, China.
In the Imbalanced Multivariate Time Series Classification (ImMTSC) task, minority-class instances typically correspond to critical events, such as system faults in power grids or abnormal health occurrences in medical monitoring. Despite being rare and random, these events are highly significant. The dynamic spatial-temporal relationships between minority-class instances and other instances make them more prone to interference from neighboring instances during classification.