
Automated Attacker A² for Enhancing Model Robustness in Adversarial Training

The paper presents A², an automated, parameterized attacker that dynamically adjusts perturbation methods and step sizes during adversarial training, demonstrating improved robustness across multiple benchmarks with modest computational overhead, and outlines future directions for further efficiency and effectiveness in secure AI systems.

AntTech

Deep neural models are vulnerable to imperceptible adversarial perturbations, posing significant security challenges for applications such as Ant Group’s services. To strengthen model robustness, the authors propose an efficient automated attacker (A²) that generates optimal perturbations in real time during adversarial training.

The work, accepted at NeurIPS 2022, builds on the premise that stronger attacks lead to more robust models. Traditional adversarial training suffers from fixed attacker hyper‑parameters and high computational cost, limiting its scalability.
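To make the limitation concrete, here is a minimal NumPy sketch of standard PGD with the fixed, hand-tuned hyper-parameters (step size `alpha`, budget `eps`, step count) that A² aims to replace with learned, per-sample choices. The toy loss and its gradient are illustrative assumptions, not from the paper.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.3, alpha=0.05, steps=10):
    """Standard PGD: fixed step size and fixed perturbation budget,
    shared across all samples and all stages of training."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)                         # gradient of the loss w.r.t. the input
        x_adv = x_adv + alpha * np.sign(g)         # ascent step on the loss
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # project back into the eps-ball around x
    return x_adv

# Toy loss L(x) = ||x - t||^2; its gradient-sign ascent pushes x away from t.
t = np.array([1.0, -1.0])
grad = lambda x: 2.0 * (x - t)
x_adv = pgd_attack(np.zeros(2), grad, eps=0.3, alpha=0.05, steps=10)
# The attack saturates the eps-ball: x_adv == [-0.3, 0.3]
```

Every sample gets the same `alpha` and `eps` regardless of how close it is to the decision boundary, which is precisely the rigidity A² targets.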

A² introduces a parameterized attack space composed of stacked attack cells, each containing discrete perturbation methods (e.g., FGM, FGSM, FGMM, FGSMM, Gaussian, Uniform, Identity) and a continuous step‑size range [1e‑4, 1]. By employing attention‑based selection over learned embeddings for each operation, the attacker automatically tunes both discrete and continuous hyper‑parameters for each sample and model state.
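The cell structure can be sketched as follows. This is a hypothetical NumPy illustration of the idea, not the paper's implementation: the class name `AttackCell`, the embedding dimension, and the scalar step-size head are assumptions; only the operation list and the step-size range [1e-4, 1] come from the text above.

```python
import numpy as np

# Discrete perturbation methods available inside one attack cell (from the paper).
OPS = ["FGM", "FGSM", "FGMM", "FGSMM", "Gaussian", "Uniform", "Identity"]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class AttackCell:
    """One cell of the parameterized attack space: attention over learned
    operation embeddings selects the method; a squashed scalar selects the
    step size in [1e-4, 1]. Both parts are trainable in the real system."""
    def __init__(self, d=8, seed=0):
        rng = np.random.default_rng(seed)
        self.op_emb = rng.standard_normal((len(OPS), d))  # one embedding per operation
        self.query = rng.standard_normal(d)               # stands in for a state-dependent query
        self.alpha_logit = 0.0                            # continuous step-size parameter

    def select(self):
        probs = softmax(self.op_emb @ self.query)         # attention scores over operations
        op = OPS[int(np.argmax(probs))]
        lo, hi = 1e-4, 1.0
        alpha = lo + (hi - lo) / (1.0 + np.exp(-self.alpha_logit))  # squash into [1e-4, 1]
        return op, alpha, probs

cell = AttackCell()
op, alpha, probs = cell.select()
```

Stacking several such cells yields a multi-step attack whose method sequence and step sizes can differ per sample and per training stage.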

The authors detail the algorithmic pipeline, including a re‑parameterization trick to enable gradient flow through discrete choices and Monte‑Carlo estimation of expected loss. Experiments on CIFAR‑10 and other robust benchmarks show that A² achieves higher robust accuracy than standard PGD attacks while incurring only ~5% additional training overhead.
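The two ingredients can be illustrated together. The sketch below uses a Gumbel-softmax relaxation, a standard re-parameterization for discrete choices, as a stand-in for the paper's trick, and averages a soft mixture of per-method losses as the Monte-Carlo estimate; the function names and the temperature value are assumptions.

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, rng=None):
    """Re-parameterized sample over discrete operations: Gumbel noise plus a
    temperature-scaled softmax keeps the selection differentiable in the logits."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-12, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))            # standard Gumbel noise
    y = (logits + g) / tau
    e = np.exp(y - y.max())
    return e / e.sum()

def mc_expected_loss(logits, losses, n=500, rng=None):
    """Monte-Carlo estimate of the expected attack loss under the relaxed
    operation distribution (each sample is a soft mixture of method losses)."""
    rng = rng or np.random.default_rng(0)
    total = 0.0
    for _ in range(n):
        w = gumbel_softmax(logits, rng=rng)
        total += float(w @ losses)
    return total / n

logits = np.array([2.0, 0.0, -2.0])    # attacker strongly prefers method 0
losses = np.array([1.0, 2.0, 3.0])     # hypothetical per-method losses
est = mc_expected_loss(logits, losses)
```

Because the relaxed sample is a differentiable function of `logits`, gradients of the estimated loss can flow back into the attacker's parameters, which is what lets A² be trained jointly with the defended model.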

Visualization of the learned perturbation distributions reveals a progression from simple noise in early attack steps to momentum‑based methods in later steps, with dataset‑specific preferences. The paper concludes with future work on automated step‑count selection, richer loss functions (e.g., CW), and regularization to balance natural and robust accuracy.

Tags: NeurIPS, machine learning security, adversarial training, automated attacker, model robustness
Written by

AntTech

Technology is the core driver of Ant's future.
