Deep learning methods have gained increased attention in various applications
due to their outstanding performance. To examine whether this high performance
stems from the proper use of data artifacts and an accurate formulation of the
given task, interpretation models have become a crucial component in
developing deep learning-based systems. Interpretation models
enable the understanding of the inner workings of deep learning models and
offer a sense of security by detecting the misuse of artifacts in the input
data. Similar to prediction models, interpretation models are also susceptible
to adversarial inputs. This work introduces two attacks, AdvEdge and
AdvEdge$^{+}$, that deceive both the target deep learning model and the coupled
interpretation model. We assess the effectiveness of the proposed attacks
against two deep learning model architectures coupled with four interpretation
models representing different categories. We implement the attacks using
various attack frameworks. We also
explore potential countermeasures against such attacks. Our analysis shows
that the attacks effectively deceive both the deep learning models and their
interpreters, and it offers insights into improving the attacks as well as
circumventing them.
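For concreteness, below is a minimal, hypothetical sketch of how a dual-objective attack of this kind could be formulated: a PGD-style optimization that simultaneously pushes the classifier toward a target label and keeps the attribution map close to the benign one, so the adversarial input looks unremarkable to the interpreter. The abstract does not give AdvEdge's actual formulation; the function name, hyperparameters (eps, alpha, lam), and the assumption of a differentiable interpreter are illustrative only.

```python
import torch
import torch.nn.functional as F

def dual_objective_pgd(model, interpreter, x, y_target, m_benign,
                       eps=8 / 255, alpha=2 / 255, steps=100, lam=0.1):
    """Illustrative PGD-style attack with two objectives (not the paper's
    exact method): (1) drive the classifier's prediction toward y_target,
    and (2) keep the attribution map produced by a differentiable
    interpreter close to the benign map m_benign."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # classification objective: targeted cross-entropy toward y_target
        loss_pred = F.cross_entropy(model(x_adv), y_target)
        # interpretation objective: attribution map should match the benign one
        loss_int = F.mse_loss(interpreter(x_adv), m_benign)
        grad = torch.autograd.grad(loss_pred + lam * loss_int, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()       # descend the joint loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the L_inf ball
            x_adv = x_adv.clamp(0.0, 1.0)             # keep pixels in valid range
    return x_adv.detach()
```

Any edge-based weighting of the perturbation that distinguishes AdvEdge from a plain dual-objective attack is omitted here, as the abstract does not describe it.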