Baidu Geek Talk
Mar 28, 2022 · Artificial Intelligence
Robust Input Visualization Methods for Vision Transformers
The paper proposes a robust, Grad-CAM-inspired visualization method for Vision Transformers that combines attention weights with gradients to produce class-specific saliency maps. It demonstrates better alignment with discriminative regions across ViT, Swin, and VOLO models, and it reduced false positives by 76% in Baidu's pornographic-content risk control system.
Deep Learning · Grad-CAM · Input Visualization