What question did this study set out to answer?

This research aims to systematically review the advancements and safety challenges in video and image security related to large AI models.

March 13, 2026 | Open Access

Research Progress on Video and Image Security in the Era of Large Models

Key Points

Systematically reviews technologies along two lines: image and video understanding security, and image and video generation security.
Summarizes the evolution of fully supervised, semi-supervised, weakly supervised, and unsupervised anomaly detection methods.
Analyzes the security risks of image and video generation and reviews deepfake detection methods.
Identifies new paradigms for zero-shot, open-vocabulary, and explainable anomaly detection based on vision-language models.
Traces the development of generative adversarial networks and diffusion models and its implications for generation security.
Highlights key challenges and future directions for image and video security research.

Abstract

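With the rapid development of multimodal large models and generative artificial intelligence, the ways in which images and videos are acquired, understood, and generated are undergoing profound change. A new generation of AI systems, represented by vision-language pretrained models and diffusion-based generative models, has shown strong capabilities in semantic alignment, cross-modal understanding, and high-fidelity content generation, and has significantly advanced applications such as intelligent security, content production, industrial inspection, and public governance. However, the rapid expansion of visual intelligence has also brought increasingly prominent security risks and governance challenges. At the understanding level, models are prone to misjudgment, bias, and insufficient robustness in complex environments, open scenarios, and weakly supervised settings. At the generation level, high-fidelity synthetic images and videos are misused for deepfakes, the spread of false information, and privacy violations, threatening social trust and public safety. Systematic research on "video and image security in the era of large models" therefore has significant theoretical value and practical relevance. This paper reviews research progress along two main lines: image and video understanding security, and image and video generation security. On understanding security, it summarizes the technical evolution of fully supervised, semi-supervised, weakly supervised, and unsupervised anomaly detection methods, and further outlines new paradigms for zero-shot, open-vocabulary, and explainable anomaly detection based on large vision-language models. On generation security, it follows the development of generative adversarial networks and diffusion models to systematically analyze the security risks of image and video generation technologies, deepfake detection methods, and the current state of their application in policy regulation and engineering practice. Finally, the paper discusses the key challenges facing current research and looks ahead to future trends in image and video security research in the era of large models, providing a reference for academic research and engineering applications in related fields.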

Cite This Study

Nong et al. studied this question.

synapsesocial.com/papers/69b3aad702a1e69014ccb8a6 https://doi.org/10.11834/jig.250656
