Despite their impressive performance in object recognition and other tasks under standard testing conditions, deep networks often fail to generalize to out-of-distribution (o.o.d.) samples. One cause for this shortcoming is that modern architectures tend to rely on ǣshortcutsǥ superficial features that correlate with categories without capturing deeper invariants that hold across contexts. Real-world concepts often possess a complex structure that can vary superficially across contexts, which can make the most intuitive and promising solutions in one context not generalize to others. One potential way to improve o.o.d. generalization is to assume simple solutions are unlikely to be valid across contexts and avoid them, which we refer to as the . A low-capacity network (LCN) with a shallow architecture should only be able to learn surface relationships, including shortcuts. We find that LCNs can serve as shortcut detectors. Furthermore, an LCN's predictions can be used in a two-stage approach to encourage a high-capacity network (HCN) to rely on deeper invariant features that should generalize broadly. In particular, items that the LCN can master are downweighted when training the HCN. Using a modified version of the CIFAR-10 dataset in which we introduced shortcuts, we found that the two-stage LCN-HCN approach reduced reliance on shortcuts and facilitated o.o.d. generalization.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10615835 | PMC |
http://dx.doi.org/10.1016/j.patrec.2022.12.010 | DOI Listing |
Sci Rep
January 2025
Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON, Canada.
Pathology provides the definitive diagnosis, and Artificial Intelligence (AI) tools are poised to improve accuracy, inter-rater agreement, and turn-around time (TAT) of pathologists, leading to improved quality of care. A high value clinical application is the grading of Lymph Node Metastasis (LNM) which is used for breast cancer staging and guides treatment decisions. A challenge of implementing AI tools widely for LNM classification is domain shift, where Out-of-Distribution (OOD) data has a different distribution than the In-Distribution (ID) data used to train the model, resulting in a drop in performance in OOD data.
View Article and Find Full Text PDFNat Commun
January 2025
Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, 53706, USA.
Identifying transitional states is crucial for understanding protein conformational changes that underlie numerous biological processes. Markov state models (MSMs), built from Molecular Dynamics (MD) simulations, capture these dynamics through transitions among metastable conformational states, and have demonstrated success in studying protein conformational changes. However, MSMs face challenges in identifying transition states, as they partition MD conformations into discrete metastable states (or free energy minima), lacking description of transition states located at the free energy barriers.
View Article and Find Full Text PDFReduced bacteria concentrations in wastewater is a key indicator of the efficacy of water resource recovery facilities (WRRFs). However, monitoring the presence of bacterial concentrations in real time at each stage of the WRRF is challenging as it requires taking and processing water samples offline. Although few studies have been proposed to predict bacterial concentrations using data-driven models, generalizing these models to unseen data from different WRRFs remains challenging.
View Article and Find Full Text PDFNeural Netw
December 2024
College of Science, Shantou University, Shantou 515063, China. Electronic address:
Graph Out-of-Distribution (OOD), requiring that models trained on biased data generalize to the unseen test data, has considerable real-world applications. One of the most mainstream methods is to extract the invariant subgraph by aligning the original and augmented data with the help of environment augmentation. However, these solutions might lead to the loss or redundancy of semantic subgraphs and result in suboptimal generalization.
View Article and Find Full Text PDFInt J Med Inform
December 2024
Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, the Netherlands; Institute of Logic, Language and Computation, University of Amsterdam, the Netherlands; Pacmed, Amsterdam, the Netherlands. Electronic address:
Background: Machine Learning (ML) models often struggle to generalize effectively to data that deviates from the training distribution. This raises significant concerns about the reliability of real-world healthcare systems encountering such inputs known as out-of-distribution (OOD) data. These concerns can be addressed by real-time detection of OOD inputs.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!