Agentic AI systems (Large Language Models (LLMs) augmented with planning, tool use, memory, and long-horizon interaction) can execute complex tasks autonomously, but their multi-step trajectories introduce new failure modes that challenge trustworthiness. This survey examines trustworthy agentic AI through two core dimensions that are critical for high-risk deployments: (i) Safety and Robustness and (ii) Privacy and System Security. For each dimension, we clarify key concepts, identify where risks emerge along the agent workflow, and summarize stage-targeted mitigation strategies. Other trustworthiness aspects (value alignment, transparency, fairness, and accountability) are discussed as relevant context rather than as parallel chapters. To support consistent comparison and deployment decisions, we consolidate evaluation into a unified metrics-and-benchmarks hub that emphasizes both outcome and process signals (e.g., constraint violations, trace completeness, and adversarial success rates) and offers scenario-to-metric guidance for release gating. We conclude by outlining open challenges, including self-evolving agents, runtime monitoring and verification, privacy-preserving personalization, and the trust–utility trade-off, and by presenting a case study of real-world security failures in open-source agentic systems (OpenClaw/Moltbook). Our goal is to provide a practical reference for researchers and practitioners building trustworthy agentic systems in high-stakes environments.
Jinhu Qi
Muzhi Li
Jiahong Liu
Chinese University of Hong Kong
Fudan University
Shanghai Academy of Environmental Sciences
Qi et al. studied this question.
www.synapsesocial.com/papers/69f9892215588823dae18083 — DOI: https://doi.org/10.20935/acadai8260