February 2, 2026

NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models

Key Points

The study aims to improve the adversarial robustness of vision-language models through a novel multi-modal tuning approach.
Introduced NAP-Tuning, a neural augmented prompt tuning framework for adversarially robust vision-language models.
Developed a multi-modal and multi-layer prompting system for targeted feature purification.
Implemented lightweight neural modules (TokenRefiners) to enhance feature reconstruction via residual connections.
Conducted comprehensive experiments across multiple datasets and attack types to evaluate performance.
NAP-Tuning outperformed existing methods significantly across various datasets and attack types.
Achieved improvements of 32.3% on ViT-B16 and 31.3% on ViT-B32 architectures compared to the strongest baselines under AutoAttack.
Demonstrated competitive clean accuracy while addressing adversarial vulnerabilities effectively.

Abstract

Vision-Language Models (VLMs) such as CLIP have demonstrated remarkable capabilities in understanding relationships between visual and textual data through joint embedding spaces. Despite their effectiveness, these models remain vulnerable to adversarial attacks, particularly in the image modality, posing significant security concerns. Our previous work on Adversarial Prompt Tuning (AdvPT) introduced learnable text prompts to enhance adversarial robustness in VLMs without extensive parameter training. Building on it, we present the Neural Augmentor framework for Multi-modal Adversarial Prompt Tuning (NAP-Tuning). NAP-Tuning first establishes a comprehensive multi-modal (text and visual) and multi-layer prompting framework. The core of this framework is a targeted structural augmentation for feature-level purification, implemented through our Neural Augmentor approach: TokenRefiners, lightweight neural modules that learn to reconstruct purified features via residual connections, directly address distortions in the feature space. This structural intervention is what enables the multi-modal and multi-layer system to perform modality-specific and layer-specific feature rectification. Comprehensive experiments demonstrate that NAP-Tuning outperforms existing methods across various datasets and attack types. Notably, under the challenging AutoAttack benchmark, our approach outperforms the strongest baselines by 32.3% on ViT-B16 and 31.3% on ViT-B32 architectures while maintaining competitive clean accuracy. This work highlights the efficacy of internal feature-level intervention in prompt tuning for adversarial robustness, moving beyond input-side alignment approaches to create an adaptive defense mechanism that can identify and rectify adversarial perturbations across embedding spaces.
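The abstract describes TokenRefiners only at a high level, so the following is a minimal illustrative sketch rather than the authors' implementation. It assumes a bottleneck MLP with a ReLU, a zero-initialised output projection (so each refiner starts as the identity map), and fresh prompt tokens prepended at every layer; the frozen encoder layers are replaced by fixed random stand-ins. All names other than `TokenRefiner` (`make_frozen_layer`, `forward_with_prompts`, and the sizes) are hypothetical.

```python
import numpy as np


class TokenRefiner:
    """Hypothetical bottleneck MLP whose output is added back to its input."""

    def __init__(self, dim, hidden, rng):
        self.w_down = rng.normal(0.0, 0.02, size=(dim, hidden))
        self.b_down = np.zeros(hidden)
        # Assumption: zero-initialised up-projection, so the refiner
        # initially leaves the features unchanged.
        self.w_up = np.zeros((hidden, dim))
        self.b_up = np.zeros(dim)

    def __call__(self, tokens):
        # tokens has shape (num_tokens, dim)
        h = np.maximum(tokens @ self.w_down + self.b_down, 0.0)  # ReLU
        correction = h @ self.w_up + self.b_up
        return tokens + correction  # residual connection


def make_frozen_layer(dim, rng):
    """Stand-in for one frozen encoder layer (not a real transformer block)."""
    w = rng.normal(0.0, dim ** -0.5, size=(dim, dim))
    return lambda x: np.tanh(x @ w)


def forward_with_prompts(tokens, layers, prompts, refiners):
    """Prepend layer-specific prompts, apply the layer, then refine."""
    for layer, prompt, refiner in zip(layers, prompts, refiners):
        x = np.concatenate([prompt, tokens], axis=0)
        x = refiner(layer(x))
        # Drop this layer's prompt outputs; the next layer adds its own.
        tokens = x[prompt.shape[0]:]
    return tokens


rng = np.random.default_rng(0)
dim, hidden, depth, n_prompt, n_tokens = 16, 4, 3, 2, 5
layers = [make_frozen_layer(dim, rng) for _ in range(depth)]
prompts = [rng.normal(0.0, 0.02, size=(n_prompt, dim)) for _ in range(depth)]
refiners = [TokenRefiner(dim, hidden, rng) for _ in range(depth)]

tokens = rng.normal(size=(n_tokens, dim))
out = forward_with_prompts(tokens, layers, prompts, refiners)
print(out.shape)  # (5, 16)
```

In a setup like this, only the prompts and the refiner weights would be trained, while the encoder layers stay frozen. The same pattern could be applied separately to the text and image encoders to obtain the modality-specific and layer-specific behaviour the abstract describes.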

Authors

Jiaming Zhang

Xin Wang

Y. Ma

Journals

IEEE Transactions on Pattern Analysis and Machine Intelligence

Institutions

Fudan University

University of Naples Federico II

Beijing Jiaotong University
