Although adversarial training (AT) is regarded as a potential defense against backdoor attacks, AT and its variants have only yielded unsatisfactory results or have even inversely strengthened backdoor attacks. The large discrepancy between expectations and reality motivates us to thoroughly evaluate the effectiveness of AT against backdoor attacks across various settings for AT and backdoor attacks. We find that the type and budget of perturbations used in AT are important, and AT with common perturbations is only effective for certain backdoor trigger patterns. Based on these empirical findings, we present some practical suggestions for backdoor defense, including relaxed adversarial perturbation and composite AT. This work not only boosts our confidence in AT's ability to defend against backdoor attacks but also provides some important insights for future research.

Download full-text PDF

Source
http://dx.doi.org/10.1109/TNNLS.2023.3281872DOI Listing

Publication Analysis

Top Keywords

backdoor attacks
24
adversarial training
8
backdoor
8
attacks
6
effectiveness adversarial
4
training backdoor
4
attacks adversarial
4
training regarded
4
regarded potential
4
potential defense
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!