Recent research has shown that neural information retrieval techniques may be susceptible to adversarial attacks. Adversarial attacks seek to manipulate the ranking of documents, with the intention of exposing users to targeted content. In this paper, we introduce the Embedding Perturbation Rank Attack (EMPRA) method, a novel approach designed to perform adversarial attacks on black-box Neural Ranking Models (NRMs). EMPRA manipulates sentence-level embeddings, guiding them towards context relevant to the query while preserving semantic integrity. This process generates adversarial texts that integrate seamlessly with the original content and remain imperceptible to humans. Our extensive evaluation, conducted on the widely used MS MARCO V1 passage collection as well as the TREC DL 2019 and TREC DL 2020 benchmarks, demonstrates the effectiveness of EMPRA against a wide range of state-of-the-art baselines in promoting a specific set of target documents within a given ranked list of results. Specifically, on MS MARCO Dev set queries, EMPRA re-ranks almost 96% of target documents originally ranked between 51 and 100 into the top 10. Furthermore, EMPRA does not rely on surrogate models for generating adversarial documents, enhancing its robustness against various victim NRMs in realistic settings.
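To make the headline figure concrete, the share of target documents promoted into the top 10 can be computed as in the minimal sketch below. The function name `top_k_boost_rate` and the rank values are hypothetical, chosen only for illustration; they are not taken from the paper's code or results.

```python
def top_k_boost_rate(ranks_after, k=10):
    """Return the share of attacked documents whose new rank is within the top k.

    Ranks are 1-based (1 = best). This is an illustrative metric sketch,
    not the authors' evaluation code.
    """
    if not ranks_after:
        raise ValueError("need at least one rank")
    return sum(1 for r in ranks_after if r <= k) / len(ranks_after)


# Toy data (assumed): five target documents that started between ranks 51 and 100.
ranks_before = [57, 63, 88, 99, 51]
ranks_after = [3, 9, 12, 1, 10]

rate = top_k_boost_rate(ranks_after, k=10)
print(f"{rate:.0%} of target documents reached the top 10")
```

In this toy example four of the five documents end up at rank 10 or better, so the rate is 80%; the abstract reports almost 96% for the same kind of measurement on MS MARCO Dev set queries.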
Amin Bigdeli
Negar Arabzadeh
Ebrahim Bagheri
ACM Transactions on Information Systems
University of California, Berkeley
University of Toronto
University of Waterloo
Bigdeli et al. (Fri,) studied this question.
www.synapsesocial.com/papers/69b5ff8d83145bc643d1c5ea — DOI: https://doi.org/10.1145/3801943