Though significant progress has been achieved on fine-grained visual classification (FGVC), severe overfitting still hinders model generalization. A recent study shows that hard samples in the training set can be easily fit, but most existing FGVC methods fail to classify some hard examples in the test set. The reason is that the model overfits those hard examples in the training set, but does not learn to generalize to unseen examples in the test set. In this article, we propose a moderate hard example modulation (MHEM) strategy to properly modulate the hard examples. MHEM encourages the model to not overfit hard examples and offers better generalization and discrimination. First, we introduce three conditions and formulate a general form of a modulated loss function. Second, we instantiate the loss function and provide a strong baseline for FGVC, where the performance of a naive backbone can be boosted and be comparable with recent methods. Moreover, we demonstrate that our baseline can be readily incorporated into the existing methods and empower these methods to be more discriminative. Equipped with our strong baseline, we achieve consistent improvements on three typical FGVC datasets, i.e., CUB-200-2011, Stanford Cars, and FGVC-Aircraft. We hope the idea of moderate hard example modulation will inspire future research work toward more effective fine-grained visual recognition.
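
The abstract does not state the three conditions or the concrete form of the modulated loss, so the following is only an illustrative sketch of what "modulating" hard examples could look like, not the MHEM formulation itself. It assumes a hypothetical weight `p_true ** gamma` applied to the per-sample cross-entropy, so that examples the model finds hard (low true-class probability) contribute less to the total loss; the function names and the `gamma` parameter are assumptions introduced here for illustration.

```python
import math


def cross_entropy(p_true):
    """Standard per-sample cross-entropy, given the true-class probability."""
    return -math.log(p_true)


def modulated_loss(p_true, gamma=1.0):
    """Hypothetical modulated loss (an assumption, not the paper's formula).

    The cross-entropy is scaled by p_true ** gamma:
    - gamma = 0 recovers plain cross-entropy;
    - larger gamma shrinks the contribution of hard examples
      (small p_true) more strongly than that of easy ones.
    """
    weight = p_true ** gamma
    return weight * cross_entropy(p_true)


def batch_loss(p_trues, gamma=1.0):
    """Mean modulated loss over a batch of true-class probabilities."""
    return sum(modulated_loss(p, gamma) for p in p_trues) / len(p_trues)


if __name__ == "__main__":
    hard, easy = 0.1, 0.9
    for p in (hard, easy):
        print(p, cross_entropy(p), modulated_loss(p, gamma=1.0))
```

In this sketch, the hard example keeps only a tenth of its cross-entropy while the easy example keeps nine tenths, which is the opposite emphasis to losses that up-weight hard examples. Whether the actual MHEM loss behaves this way is not something the abstract confirms.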

Source
http://dx.doi.org/10.1109/TNNLS.2022.3213563
