What question did this study set out to answer?

This work aims to identify and address the challenges related to the robustness of AI models, focusing on trustworthy AI principles.

April 10, 2026 · Open Access

Advancing adversarial and LLM robustness in trustworthy AI: a comprehensive survey

Key Points

Systematic review of robustness evaluation metrics and methods.
Identification of key issues and challenges in AI model robustness.
Exploration of enhancement strategies at various stages of the AI model lifecycle.
Examination of robustness issues specific to generative large language models.
Robustness is a critical factor limiting the adoption of AI models.
Current practices often lack sufficient explainability and predictability.
Various enhancement strategies are proposed, spanning data preprocessing, training, and model design.

Abstract

In recent years, artificial intelligence (AI) models have demonstrated strong performance and broad application potential across various fields. However, they still face numerous security and robustness challenges in practical applications. Among these, insufficient robustness stands out as a critical factor contributing to the perceived untrustworthiness of AI models and remains a major barrier to their widespread adoption. Moreover, most current AI models are designed with a black-box structure and lack sufficient explainability, which makes it difficult for researchers to understand their decision-making mechanisms. This ‘invisible’ nature not only limits the ability to predict model behavior but also increases the instability of models in complex and unknown environments. In this paper, we systematically review the evaluation methods and enhancement strategies for AI model robustness from multiple perspectives: (1) We identify the primary issues and technical challenges in the robustness of current AI models. (2) We explore the connections and distinctions between core concepts of trustworthy AI. (3) We summarize recent developments in robustness evaluation from the perspectives of both evaluation metrics and evaluation methods. (4) We examine robustness enhancement methods across different stages of the AI model lifecycle: data preprocessing, training, model architecture design, and post-processing. (5) We focus on hallucinations and other robustness issues faced by generative large language models (LLMs), summarizing current research progress and mitigation strategies. (6) Finally, we discuss open questions and future research directions in the field of AI model robustness.


Authors

Qingzhe Tang

Jingwei Qian

Xiaozhi Du

Journals

Artificial Intelligence Review

Institutions

Xi'an Jiaotong University

State Grid Corporation of China (China)

Shanghai Electric (China)
