What is the Visual Cognition Gap between Humans and Multimodal LLMs? | Synapse