As a crucial step toward real-world learning scenarios with changing environments, dataset shift theory and invariant representation learning algorithms have been extensively studied to relax the identical-distribution assumption of the classical learning setting. Among the different assumptions about the nature of shifting distributions, generalized label shift (GLS) is the most recently developed and shows great potential for handling the complex factors within the shift. In this paper, we explore the limitations of current dataset shift theory and algorithms, and provide new insights by presenting a comprehensive understanding of GLS. On the theoretical side, two informative generalization bounds are derived, and the GLS learner is proved to be sufficiently close to the optimal target model from the Bayesian perspective. The main results show the insufficiency of invariant representation learning and prove the sufficiency and necessity of GLS correction for generalization, which provides theoretical support and innovation for exploring generalizable models under dataset shift. On the methodological side, we provide a unified view of existing shift correction frameworks and propose a kernel embedding-based correction algorithm (KECA) to minimize the generalization error and achieve successful knowledge transfer. Both the theoretical results and extensive experimental evaluations demonstrate the sufficiency and necessity of GLS correction for addressing dataset shift and the superiority of the proposed algorithm.
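To make the "kernel embedding" idea concrete, the sketch below estimates the squared maximum mean discrepancy (MMD) between source and target representations, both marginally and per class. It is not KECA, whose details the abstract does not give. It is only a diagnostic in the spirit of GLS, which concerns whether class-conditional representation distributions match across domains. The function names, the RBF kernel, the `gamma` bandwidth, and the availability of target labels (or pseudo-labels) are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2ab.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))

def mmd2(x, y, gamma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

def class_conditional_mmd2(zs, ys, zt, yt, gamma=1.0):
    """Squared MMD per class between source (zs, ys) and target (zt, yt).

    Assumes labels are available on both sides; in an unsupervised setting
    yt would have to be replaced by pseudo-labels (hypothetical choice).
    """
    classes = np.intersect1d(ys, yt)  # classes present in both domains
    return {int(c): mmd2(zs[ys == c], zt[yt == c], gamma) for c in classes}
```

Under these assumptions, small per-class values alongside a large marginal `mmd2(zs, zt)` would suggest that the remaining discrepancy comes mainly from differing label proportions rather than from mismatched class-conditional representations.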

Source
http://dx.doi.org/10.1109/TPAMI.2024.3417214
