What type of study is this?

This is a Literature Review study.

September 5, 2025

Open Access

Symmetry-Aware Advances in Multimodal Large Language Models: Architectures, Training, and Evaluation

Key Points

Multimodal large language models demonstrate strong reasoning abilities across various tasks and modalities.
The survey systematically reviews MLLM architectures, training methodologies, and evaluation techniques.
Emerging trends in MLLMs focus on balanced integration of modalities, tasks, and symmetry-driven reasoning.
Key challenges in multimodal understanding are critically examined, offering potential solutions based on recent advancements.

Abstract

With the exponential growth of multimodal data, the limitations of traditional unimodal models in cross-modal understanding and complex scenario reasoning have become increasingly evident. Built upon the foundation of Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) retain strong reasoning abilities and demonstrate unique capabilities in multimodal understanding. This survey provides a comprehensive overview of the current research landscape of MLLMs. It systematically analyzes mainstream model architectures, training, fine-tuning strategies, and task classifications, while offering a structured account of evaluation methodologies. Beyond synthesis, the paper highlights emerging trends that aim for balanced integration across modalities, tasks, and components, and critically examines key challenges together with potential solutions. The survey specifically emphasizes recent reasoning-oriented MLLMs, with a focus on DeepSeek-R1, analyzing their design paradigms and contributions from the perspective of symmetric reasoning capabilities. Overall, this work offers a comprehensive overview of cutting-edge advancements and lays a foundation for the future development of MLLMs, especially those guided by symmetry principles.

Authors

Xinran Liu

Haojie Liu

Journals

Symmetry

Institutions

Zhejiang University

Sichuan University
