A too-good-to-be-true prior to reduce shortcut reliance.

Nikolay Dagaev, Brett D Roads, Xiaoliang Luo, Daniel N Barry, Kaustubh R Patil, Bradley C Love

Pattern Recognit Lett

Department of Experimental Psychology, University College London, London, United Kingdom.

Published: February 2023

Despite their impressive performance in object recognition and other tasks under standard testing conditions, deep networks often fail to generalize to out-of-distribution (o.o.d.) samples. One cause of this shortcoming is that modern architectures tend to rely on "shortcuts": superficial features that correlate with categories without capturing deeper invariants that hold across contexts. Real-world concepts often possess a complex structure that can vary superficially across contexts, so the most intuitive and promising solutions in one context may fail to generalize to others. One potential way to improve o.o.d. generalization is to assume that simple solutions are unlikely to be valid across contexts and to avoid them, which we refer to as the too-good-to-be-true prior. A low-capacity network (LCN) with a shallow architecture should only be able to learn surface relationships, including shortcuts. We find that LCNs can serve as shortcut detectors. Furthermore, an LCN's predictions can be used in a two-stage approach to encourage a high-capacity network (HCN) to rely on deeper invariant features that should generalize broadly. In particular, items that the LCN can master are downweighted when training the HCN. Using a modified version of the CIFAR-10 dataset into which we introduced shortcuts, we found that the two-stage LCN-HCN approach reduced reliance on shortcuts and facilitated o.o.d. generalization.
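
The two-stage idea, downweighting items that the LCN can master when training the HCN, can be illustrated with a minimal sketch. The abstract does not state the weighting function, so the rule used here (weight = 1 - LCN probability of the correct label) and the function names `lcn_downweights` and `weighted_cross_entropy` are assumptions made for illustration only, not the authors' implementation.

```python
import math
from typing import List, Sequence


def lcn_downweights(lcn_true_class_probs: Sequence[float]) -> List[float]:
    """Turn the LCN's confidence in the correct label into per-item weights.

    Assumption (not from the paper): weight = 1 - p. Items the LCN
    predicts confidently (p near 1, so plausibly solvable via a shortcut)
    receive weights near 0; items it cannot master keep weights near 1.
    """
    for p in lcn_true_class_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return [1.0 - p for p in lcn_true_class_probs]


def weighted_cross_entropy(
    hcn_true_class_probs: Sequence[float],
    weights: Sequence[float],
    eps: float = 1e-12,
) -> float:
    """Weighted mean of the HCN's per-item negative log-likelihoods."""
    if len(hcn_true_class_probs) != len(weights):
        raise ValueError("probabilities and weights must have equal length")
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # every item was fully downweighted
    # Clamp probabilities away from 0 so the logarithm stays finite.
    losses = [-math.log(max(p, eps)) for p in hcn_true_class_probs]
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


# Hypothetical probabilities assigned to the correct label for three items.
lcn_probs = [0.99, 0.95, 0.20]  # the LCN masters the first two items
hcn_probs = [0.90, 0.80, 0.40]

weights = lcn_downweights(lcn_probs)
loss = weighted_cross_entropy(hcn_probs, weights)
```

In this sketch the third item, which the LCN cannot solve, dominates the HCN's loss. Normalizing by the sum of the weights is a design choice that keeps the loss scale comparable across batches; other schemes are equally plausible.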

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10615835
DOI: http://dx.doi.org/10.1016/j.patrec.2022.12.010

Similar Publications

Out-of-distribution generalization for segmentation of lymph node metastasis in breast cancer.

Sci Rep

January 2025

Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON, Canada.

Yiannis Varnava, Kiran Jakate, Richard Garnett, Dimitrios Androutsos, Pascal N Tyrrell

Pathology provides the definitive diagnosis, and Artificial Intelligence (AI) tools are poised to improve the accuracy, inter-rater agreement, and turn-around time (TAT) of pathologists, leading to improved quality of care. A high-value clinical application is the grading of Lymph Node Metastasis (LNM), which is used for breast cancer staging and guides treatment decisions. A challenge in implementing AI tools widely for LNM classification is domain shift, where Out-of-Distribution (OOD) data has a different distribution than the In-Distribution (ID) data used to train the model, resulting in a drop in performance on OOD data.

Exploring transition states of protein conformational changes via out-of-distribution detection in the hyperspherical latent space.

Nat Commun

January 2025

Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA.

Bojun Liu, Jordan G Boysen, Ilona Christy Unarta, Xuefeng Du, Yixuan Li

Identifying transitional states is crucial for understanding protein conformational changes that underlie numerous biological processes. Markov state models (MSMs), built from Molecular Dynamics (MD) simulations, capture these dynamics through transitions among metastable conformational states, and have demonstrated success in studying protein conformational changes. However, MSMs face challenges in identifying transition states, as they partition MD conformations into discrete metastable states (or free energy minima), lacking description of transition states located at the free energy barriers.

A calibration framework toward model generalization for bacteria concentration estimation in water resource recovery facilities.

Sci Rep

December 2024

National Institute for Research in Digital Science and Technology (INRIA), Paris-Saclay, France.

Fahad Aljehani, Ibrahima N'Doye, Pei-Ying Hong, Mohammad Khalil Monjed, Taous-Meriem Laleg-Kirati

A reduced bacterial concentration in wastewater is a key indicator of the efficacy of water resource recovery facilities (WRRFs). However, monitoring bacterial concentrations in real time at each stage of the WRRF is challenging, as it requires taking and processing water samples offline. Although a few studies have proposed data-driven models to predict bacterial concentrations, generalizing these models to unseen data from different WRRFs remains challenging.

Unifying invariant and variant features for graph out-of-distribution via probability of necessity and sufficiency.

Neural Netw

December 2024

College of Science, Shantou University, Shantou 515063, China.

Xuexin Chen, Ruichu Cai, Kaitao Zheng, Zhifan Jiang, Zhengting Huang

Graph Out-of-Distribution (OOD), requiring that models trained on biased data generalize to the unseen test data, has considerable real-world applications. One of the most mainstream methods is to extract the invariant subgraph by aligning the original and augmented data with the help of environment augmentation. However, these solutions might lead to the loss or redundancy of semantic subgraphs and result in suboptimal generalization.

Unmasking the chameleons: A benchmark for out-of-distribution detection in medical tabular data.

Int J Med Inform

December 2024

Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, the Netherlands; Institute of Logic, Language and Computation, University of Amsterdam, the Netherlands; Pacmed, Amsterdam, the Netherlands.

Mohammad Azizmalayeri, Ameen Abu-Hanna, Giovanni Cinà

Background: Machine Learning (ML) models often struggle to generalize effectively to data that deviates from the training distribution. This raises significant concerns about the reliability of real-world healthcare systems encountering such inputs known as out-of-distribution (OOD) data. These concerns can be addressed by real-time detection of OOD inputs.
