Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem in machine learning. Despite extensive studies on AUPRC optimization, its generalization is still an open problem. In this work, we present the first attempt at an algorithm-dependent generalization analysis of stochastic AUPRC optimization. The obstacles are three-fold. First, according to the consistency analysis, the majority of existing stochastic estimators are biased under biased sampling strategies. To address this issue, we propose a stochastic estimator with sampling-rate-invariant consistency and reduce the consistency error by estimating the full-batch scores with a score memory. Second, standard techniques for algorithm-dependent generalization analysis cannot be directly applied to listwise losses. To fill this gap, we extend model stability from instance-wise losses to listwise losses. Third, AUPRC optimization involves a compositional optimization problem, which introduces complicated computations. We propose to reduce the computational complexity by matrix spectral decomposition. Based on these techniques, we derive the first algorithm-dependent generalization bound for AUPRC optimization. Motivated by the theoretical results, we propose a generalization-induced learning framework, which improves AUPRC generalization by equivalently increasing the batch size and the number of valid training examples. Experiments on image retrieval and long-tailed classification support the effectiveness and soundness of our framework.
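To make the sampling-bias point concrete, the following is a minimal sketch, not the estimator proposed in the paper. It computes non-interpolated average precision (a standard empirical estimate of AUPRC) on a full toy dataset and on a mini-batch in which positives are oversampled. The function name `average_precision` and the toy scores and labels are assumptions introduced purely for illustration.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated average precision, a common empirical AUPRC estimate."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_pos = labels.sum()
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples by descending score.
    order = np.argsort(-scores, kind="stable")
    sorted_labels = labels[order]
    # Precision at each rank = (positives seen so far) / rank.
    hits = np.cumsum(sorted_labels)
    ranks = np.arange(1, len(sorted_labels) + 1)
    precision_at_k = hits / ranks
    # Average the precision values taken at the ranks of the positives.
    return float((precision_at_k * sorted_labels).sum() / n_pos)

# Toy "full" dataset (hypothetical values): 2 positives among 8 examples.
full_scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2])
full_labels = np.array([1, 0, 0, 0, 1, 0, 0, 0])

# A mini-batch that keeps both positives but only two negatives,
# i.e. positives are sampled at a higher rate than in the full data.
batch_idx = np.array([0, 4, 1, 7])

full_ap = average_precision(full_scores, full_labels)
batch_ap = average_precision(full_scores[batch_idx], full_labels[batch_idx])

print(f"full-data AP:  {full_ap:.3f}")   # 0.700
print(f"mini-batch AP: {batch_ap:.3f}")  # 0.833
```

With the model scores held fixed, the mini-batch value differs from the full-data value only because the class ratio in the batch changed. This is the kind of sampling-rate dependence that the abstract says its estimator is designed to avoid.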

Source: http://dx.doi.org/10.1109/TPAMI.2024.3361861

Publication Analysis

Top Keywords (keyword: count)

auprc optimization: 20
algorithm-dependent generalization: 16
listwise losses: 8
auprc: 7
optimization: 7
generalization: 5
algorithm-dependent: 4
generalization auprc: 4
optimization theory: 4
theory algorithm: 4

Similar Publications

Robust multi-modal fusion architecture for medical data with knowledge distillation.

Comput Methods Programs Biomed

December 2024

School of Biomedical Engineering, Capital Medical University, No.10, Xitoutiao, You An Men, Fengtai District, Beijing 100069, China; Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, No.10, Xitoutiao, You An Men, Fengtai District, Beijing 100069, China. Electronic address:

Background: The fusion of multi-modal data has been shown to significantly enhance the performance of deep learning models, particularly on medical data. However, missing modalities are common in medical data due to patient specificity, which poses a substantial challenge to the application of these models.

Objective: This study aimed to develop a novel and efficient multi-modal fusion framework for medical datasets that maintains consistent performance, even in the absence of one or more modalities.

Background And Objective: Accurate prediction of perioperative major adverse cardiovascular events (MACEs) is crucial, as it not only aids clinicians in comprehensively assessing patients' surgical risks and tailoring personalized surgical and perioperative management plans, but also for information-based shared decision-making with patients and efficient allocation of medical resources. This study developed and validated a machine learning (ML) model using accessible preoperative clinical data to predict perioperative MACEs in stable coronary artery disease (SCAD) patients undergoing noncardiac surgery (NCS).

Methods: We collected data from 9171 adult SCAD patients who underwent NCS and extracted 64 preoperative variables.

Background: Distant metastasis of gastric cancer can seriously affect the treatment strategy of gastric cancer patients, so it is essential to identify patients at high risk of distant metastasis of gastric cancer earlier.

Method: In this study, we retrospectively collected research data from 18,472 gastric cancer patients from the SEER database. We applied six machine learning algorithms to construct a model that can predict distant metastasis of gastric cancer.

The array of complex and evolving patient data has limited clinical decision making in the emergency department (ED). This study introduces an advanced deep learning algorithm designed to enhance real-time prediction accuracy for integration into a novel Clinical Decision Support System (CDSS). A retrospective study was conducted using data from a level 1 tertiary hospital.

Metrics for evaluation of automatic epileptogenic zone localization in intracranial electrophysiology.

Clin Neurophysiol

January 2025

Montreal Neurological Hospital, McGill University, 3801 Rue University, Montreal, QC H3A 2B4, Quebec, Canada; Department of Neurology, Duke University Medical School and Department of Biomedical Engineering, Pratt School of Engineering, 2424 Erwin Road, Durham, NC 27705, the United States of America. Electronic address:

Introduction: Precise localization of the epileptogenic zone is critical for successful epilepsy surgery. However, imbalanced datasets in terms of epileptic vs. normal electrode contacts and a lack of standardized evaluation guidelines hinder the consistent evaluation of automatic machine learning localization models.
