Context-Free Word Importance Scores for Attacking Neural Networks

Authors

  • Nimrah Shakeel Teradata, Pakistan
  • Saifullah Shakeel Lahore University of Management Sciences, Pakistan

DOI:

https://doi.org/10.47852/bonviewJCCE2202406

Keywords:

neural networks, adversarial attacks, NLP

Abstract

Leave-One-Out (LOO) scores provide estimates of feature importance in neural networks for use in adversarial attacks. In this work, we present context-free word scores as a query-efficient alternative. Experiments show that these approximations are effective for black-box attacks on neural networks trained for text classification, particularly for CNNs. The model query count for this method scales as O(vocab_size * model_input_length). It is independent of the number of examples and the number of features to be perturbed.
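The abstract does not spell out how the context-free scores are computed, so the following Python sketch is only one plausible reading that matches the stated query count: each vocabulary word is placed alone at each input position of an otherwise padded input, and the change in the model's output is averaged over positions. The padding token, the toy classifier, the averaging step, and all function names are assumptions made for illustration and are not taken from the paper. A standard LOO scorer is included for contrast, since it needs fresh queries for every example.

```python
from typing import Callable, Dict, List, Sequence

PAD = "<pad>"  # hypothetical padding token (assumption, not from the paper)

Model = Callable[[Sequence[str]], List[float]]  # tokens -> class probabilities


def loo_scores(model: Model, tokens: Sequence[str], label: int) -> List[float]:
    """Leave-One-Out: confidence drop when each token is replaced by PAD.

    Costs len(tokens) + 1 queries for every example that is scored.
    """
    base = model(tokens)[label]
    scores = []
    for i in range(len(tokens)):
        perturbed = list(tokens)
        perturbed[i] = PAD
        scores.append(base - model(perturbed)[label])
    return scores


def context_free_scores(model: Model, vocab: Sequence[str],
                        input_length: int, label: int) -> Dict[str, float]:
    """Assumed context-free scoring: probe each word alone at each position.

    Costs vocab_size * input_length + 1 queries in total, regardless of how
    many examples are attacked later.
    """
    base = model([PAD] * input_length)[label]
    table = {}
    for word in vocab:
        total = 0.0
        for pos in range(input_length):
            probe = [PAD] * input_length
            probe[pos] = word
            total += model(probe)[label] - base
        table[word] = total / input_length  # averaging is an assumption
    return table


class CountingModel:
    """Wraps a model and counts how many times it is queried."""

    def __init__(self, fn: Model) -> None:
        self.fn = fn
        self.queries = 0

    def __call__(self, tokens: Sequence[str]) -> List[float]:
        self.queries += 1
        return self.fn(tokens)


def toy_model(tokens: Sequence[str]) -> List[float]:
    """Toy stand-in for a black-box sentiment classifier (illustration only)."""
    good = sum(1 for t in tokens if t == "good")
    bad = sum(1 for t in tokens if t == "bad")
    p_pos = min(1.0, max(0.0, 0.5 + 0.1 * good - 0.1 * bad))
    return [1.0 - p_pos, p_pos]


counting = CountingModel(toy_model)
scores = context_free_scores(counting, ["good", "bad", "the"], 4, label=1)
print(scores, counting.queries)
```

Under these assumptions the score table is built once and can then be looked up for any number of input examples without further model queries, which is the sense in which the cost is independent of the number of examples.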

Received: 13 July 2022 | Revised: 18 July 2022 | Accepted: 24 August 2022

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Published

2022-09-27

How to Cite

Shakeel, N., & Shakeel, S. (2022). Context-Free Word Importance Scores for Attacking Neural Networks. Journal of Computational and Cognitive Engineering, 1(4), 187–192. https://doi.org/10.47852/bonviewJCCE2202406

Section

Research Articles