
Automated Attacker A² for Enhancing Model Robustness in Adversarial Training

The paper presents A², an automated, parameterized attacker that dynamically adjusts perturbation methods and step sizes during adversarial training, demonstrating improved robustness across multiple benchmarks with modest computational overhead, and outlines future directions for further efficiency and effectiveness in secure AI systems.

AntTech

Deep neural models are vulnerable to imperceptible adversarial perturbations, posing significant security challenges for applications such as Ant Group’s services. To strengthen model robustness, the authors propose an efficient automated attacker (A²) that generates optimal perturbations in real time during adversarial training.

The work, accepted at NeurIPS 2022, builds on the premise that stronger attacks lead to more robust models. Traditional adversarial training suffers from fixed attacker hyper‑parameters and high computational cost, limiting its scalability.
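To make the limitation concrete, here is a minimal NumPy sketch of standard PGD with the fixed, hand-tuned hyper-parameters (step size `alpha`, budget `eps`, step count) that A² aims to replace with learned, per-sample choices. The toy loss and its gradient are illustrative assumptions, not from the paper.

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.3, alpha=0.05, steps=10):
    """Standard PGD: fixed step size and fixed perturbation budget,
    shared across all samples and all stages of training."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)                         # gradient of the loss w.r.t. the input
        x_adv = x_adv + alpha * np.sign(g)         # ascent step on the loss
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # project back into the eps-ball around x
    return x_adv

# Toy loss L(x) = ||x - t||^2; its gradient-sign ascent pushes x away from t.
t = np.array([1.0, -1.0])
grad = lambda x: 2.0 * (x - t)
x_adv = pgd_attack(np.zeros(2), grad, eps=0.3, alpha=0.05, steps=10)
# The attack saturates the eps-ball: x_adv == [-0.3, 0.3]
```

Every sample gets the same `alpha` and `eps` regardless of how close it is to the decision boundary, which is precisely the rigidity A² targets.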

A² introduces a parameterized attack space composed of stacked attack cells, each containing discrete perturbation methods (e.g., FGM, FGSM, FGMM, FGSMM, Gaussian, Uniform, Identity) and a continuous step‑size range [1e‑4, 1]. By employing attention‑based selection over learned embeddings for each operation, the attacker automatically tunes both discrete and continuous hyper‑parameters for each sample and model state.
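The cell structure can be sketched as follows. This is a hypothetical NumPy illustration of the idea, not the paper's implementation: the class name `AttackCell`, the embedding dimension, and the scalar step-size head are assumptions; only the operation list and the step-size range [1e-4, 1] come from the text above.

```python
import numpy as np

# Discrete perturbation methods available inside one attack cell (from the paper).
OPS = ["FGM", "FGSM", "FGMM", "FGSMM", "Gaussian", "Uniform", "Identity"]

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class AttackCell:
    """One cell of the parameterized attack space: attention over learned
    operation embeddings selects the method; a squashed scalar selects the
    step size in [1e-4, 1]. Both parts are trainable in the real system."""
    def __init__(self, d=8, seed=0):
        rng = np.random.default_rng(seed)
        self.op_emb = rng.standard_normal((len(OPS), d))  # one embedding per operation
        self.query = rng.standard_normal(d)               # stands in for a state-dependent query
        self.alpha_logit = 0.0                            # continuous step-size parameter

    def select(self):
        probs = softmax(self.op_emb @ self.query)         # attention scores over operations
        op = OPS[int(np.argmax(probs))]
        lo, hi = 1e-4, 1.0
        alpha = lo + (hi - lo) / (1.0 + np.exp(-self.alpha_logit))  # squash into [1e-4, 1]
        return op, alpha, probs

cell = AttackCell()
op, alpha, probs = cell.select()
```

Stacking several such cells yields a multi-step attack whose method sequence and step sizes can differ per sample and per training stage.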

The authors detail the algorithmic pipeline, including a re‑parameterization trick to enable gradient flow through discrete choices and Monte‑Carlo estimation of expected loss. Experiments on CIFAR‑10 and other robust benchmarks show that A² achieves higher robust accuracy than standard PGD attacks while incurring only ~5% additional training overhead.
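The two ingredients can be illustrated together. The sketch below uses a Gumbel-softmax relaxation, a standard re-parameterization for discrete choices, as a stand-in for the paper's trick, and averages a soft mixture of per-method losses as the Monte-Carlo estimate; the function names and the temperature value are assumptions.

```python
import numpy as np

def gumbel_softmax(logits, tau=0.5, rng=None):
    """Re-parameterized sample over discrete operations: Gumbel noise plus a
    temperature-scaled softmax keeps the selection differentiable in the logits."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-12, 1.0, size=logits.shape)
    g = -np.log(-np.log(u))            # standard Gumbel noise
    y = (logits + g) / tau
    e = np.exp(y - y.max())
    return e / e.sum()

def mc_expected_loss(logits, losses, n=500, rng=None):
    """Monte-Carlo estimate of the expected attack loss under the relaxed
    operation distribution (each sample is a soft mixture of method losses)."""
    rng = rng or np.random.default_rng(0)
    total = 0.0
    for _ in range(n):
        w = gumbel_softmax(logits, rng=rng)
        total += float(w @ losses)
    return total / n

logits = np.array([2.0, 0.0, -2.0])    # attacker strongly prefers method 0
losses = np.array([1.0, 2.0, 3.0])     # hypothetical per-method losses
est = mc_expected_loss(logits, losses)
```

Because the relaxed sample is a differentiable function of `logits`, gradients of the estimated loss can flow back into the attacker's parameters, which is what lets A² be trained jointly with the defended model.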

Visualization of the learned perturbation distributions reveals a progression from simple noise in early attack steps to momentum‑based methods in later steps, with dataset‑specific preferences. The paper concludes with future work on automated step‑count selection, richer loss functions (e.g., CW), and regularization to balance natural and robust accuracy.

Tags: NeurIPS, machine learning security, adversarial training, automated attacker, model robustness
Written by

AntTech

Technology is the core driver of Ant's future.
