January 17, 2021


Small Input Noise is Enough to Defend Against Query-based Black-box Attacks. (arXiv:2101.04829v1 [cs.CR])

While deep neural networks show unprecedented performance in various tasks,
their vulnerability to adversarial examples hinders their deployment in
safety-critical systems. Many studies have shown that attacks are possible
even in a black-box setting where an adversary cannot access the target model’s
internal information. Most black-box attacks are based on queries, each of
which obtains the target model’s output for an input, and many recent studies
focus on reducing the number of required queries. In this paper, we examine
an implicit assumption of these attacks: that the target model’s
output corresponds exactly to the query input. If some randomness is introduced
into the model to break this assumption, query-based attacks may have
tremendous difficulty in both gradient estimation and local search, which are
the core of their attack process. Motivated by this, we observe that even small
additive input noise can neutralize most query-based attacks, and we name this
simple yet effective approach Small Noise Defense (SND). We analyze how SND can
defend against query-based black-box attacks and demonstrate its effectiveness
against eight state-of-the-art attacks on the CIFAR-10 and ImageNet
datasets. Despite its strong defensive ability, SND nearly preserves the original
clean accuracy and computational speed. SND is readily applicable to
pre-trained models by adding only one line of code at the inference stage, so
we hope that it will be used as a baseline defense against query-based
black-box attacks in the future.
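
The idea of adding small noise to the input at inference time can be illustrated with a minimal NumPy sketch. Everything named here is an assumption for illustration only: `toy_model` is a hypothetical linear stand-in for a classifier, `with_small_noise` and `finite_difference` are hypothetical helpers, and the values `sigma=0.01` and `h=1e-3` are arbitrary choices, not settings taken from the abstract. The sketch compares a two-query finite-difference gradient estimate on the undefended model with the same estimate on the noise-wrapped model.

```python
import numpy as np


def toy_model(x):
    """Hypothetical stand-in for a classifier: a fixed linear score of x."""
    w = np.linspace(-1.0, 1.0, x.size).reshape(x.shape)
    return float(np.sum(w * x))


def with_small_noise(model, sigma=0.01, rng=None):
    """Wrap `model` so every query is answered on x + N(0, sigma^2 I).

    sigma=0.01 is an illustrative assumption, not a value from the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng

    def defended(x):
        # The single added line: perturb the input before inference.
        return model(x + sigma * rng.standard_normal(x.shape))

    return defended


def finite_difference(f, x, i, h=1e-3):
    """Estimate df/dx_i from two queries, as a query-based attacker might."""
    e = np.zeros_like(x)
    e.flat[i] = h
    return (f(x + e) - f(x - e)) / (2 * h)


x = np.full(16, 0.5)
defended = with_small_noise(toy_model, sigma=0.01, rng=np.random.default_rng(0))

# Undefended model: the estimate recovers the true derivative w[0] = -1.
clean_estimate = finite_difference(toy_model, x, i=0)

# Defended model: each of the two queries sees independent noise.
noisy_estimates = np.array(
    [finite_difference(defended, x, i=0) for _ in range(200)]
)

print("clean estimate:", clean_estimate)
print("std of estimates under noise:", noisy_estimates.std())
```

In this toy setting the two queries receive independent noise, so the difference of their outputs is dominated by that noise once it is divided by the small step `2 * h`. The estimates therefore scatter widely around the true derivative, which is one way to picture why gradient estimation becomes difficult when the output no longer corresponds exactly to the query input.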