Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective