Adversarial Examples

Inputs altered by small, often imperceptible perturbations that are deliberately crafted to make AI models produce incorrect outputs, frequently with high confidence.

1. Goodfellow et al. (2014) showed that adding human-imperceptible pixel noise to a panda image causes GoogLeNet to classify it as a gibbon with 99.3% confidence. The paper introduced the fast gradient sign method and helped establish adversarial examples as a major research area in deep learning.
2. IBM's Adversarial Robustness Toolbox is used by financial institutions to test credit-scoring models against adversarial feature perturbations, identifying cases where a small change in the input data flips a loan decision.
3. Automotive OEMs test object detection systems against adversarial stop-sign stickers: physically printed patterns applied to a stop sign cause some detection models to misclassify it at road speed.
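
The mechanism behind the first two examples can be illustrated with a minimal sketch of the fast gradient sign method (FGSM) applied to a toy logistic-regression "credit model". The weights, bias, applicant features, and the perturbation budget eps are all hypothetical values chosen for illustration; real attacks target far larger models, but the step is the same: move each feature by at most eps in the direction that increases the loss.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """Return x + eps * sign(dLoss/dx) for a logistic-regression model.

    x: feature vector, y: true label (0 or 1), w/b: model parameters,
    eps: maximum change allowed per feature (L-infinity budget).
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w  # gradient of binary cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)

# Hypothetical toy model and applicant (made-up numbers, not real data).
w = np.array([1.5, -2.0, 0.5])
b = -0.1
x = np.array([0.4, 0.2, 0.3])  # normalized features
y = 1                          # true label: approve

p_clean = sigmoid(w @ x + b)             # model score on the clean input
x_adv = fgsm_perturb(x, y, w, b, eps=0.1)
p_adv = sigmoid(w @ x_adv + b)           # model score on the perturbed input

print(f"clean score:     {p_clean:.3f} -> {'approve' if p_clean > 0.5 else 'deny'}")
print(f"perturbed score: {p_adv:.3f} -> {'approve' if p_adv > 0.5 else 'deny'}")
```

With these assumed values, no feature moves by more than 0.1, yet the score crosses the 0.5 decision threshold and the decision changes from approve to deny. For a linear model the logit shifts by eps times the L1 norm of the weights, which is why many small per-feature changes can add up to a large change in the output.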
