What question did this study set out to answer?

To design a robust adaptive safe critic control for stochastic multiagent systems under asymmetric constraints.

March 7, 2026

Time-Varying HJBE-Based Adaptive Safe Critic Control Design for Stochastic Asymmetric Constrained Multiagent Systems

Key Points

Addressed adaptive safe critic control design for stochastic multiagent systems (MASs) under asymmetric state and input constraints.
Proposed a unified transformation function to handle asymmetric state constraints.
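Formulated a time-varying Hamilton-Jacobi-Bellman equation (HJBE) by combining the Bellman optimality principle with Itô's lemma to account for stochastic disturbances.
Developed an integral reinforcement learning (IRL) algorithm that improves data utilization and removes the need for explicit drift dynamics.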
Implemented a time-varying single-critic network for optimal policy generation.
Incorporated experience replay (ER) into the critic weight update to enhance learning efficiency and relax the persistent excitation (PE) condition.
Reduced computational complexity considerably by using a single critic network to approximate the HJBE solution.
Enhanced the robustness of the control policies to stochastic disturbances.
Simulation examples validated the feasibility and efficiency of the approach.
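
To make the idea of a transformation function for asymmetric state constraints more concrete, here is a minimal Python sketch. It is not the paper's UTF, whose exact form is not given in this summary. It assumes a common log-ratio barrier-type map for a state constrained to the open interval (-lower, upper); the names `to_unconstrained`, `to_constrained`, `lower`, and `upper` are hypothetical.

```python
import math


def to_unconstrained(x, lower, upper):
    """Map a state x in (-lower, upper) onto the whole real line.

    Assumed log-ratio form (illustrative only, not the paper's UTF).
    The bounds may differ in magnitude, and x = 0 maps to z = 0.
    """
    if not -lower < x < upper:
        raise ValueError("state violates the asymmetric constraint")
    return math.log((upper * (lower + x)) / (lower * (upper - x)))


def to_constrained(z, lower, upper):
    """Inverse map: a real z returns a state inside (-lower, upper)."""
    if z >= 0:
        decay = math.exp(-z)  # stays in (0, 1], so no overflow
        return lower * upper * (1.0 - decay) / (upper * decay + lower)
    growth = math.exp(z)  # stays in (0, 1) for negative z
    return lower * upper * (growth - 1.0) / (upper + lower * growth)
```

With a map of this kind, keeping the transformed variable bounded is enough to keep the original state inside its asymmetric bounds, which is why the constrained problem can be studied as the stability of an unconstrained error system.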

Abstract

In this article, we investigate the problem of adaptive safe critic control design for stochastic multiagent systems (MASs) subject to asymmetric state and input constraints. To systematically address asymmetric state constraints, a unified transformation function (UTF) is proposed to convert the constrained consensus control problem into the stability analysis of an unconstrained error system. In addition, a nonquadratic cost function is incorporated to address input limitations effectively. Building upon these developments, a time-varying Hamilton-Jacobi-Bellman equation (HJBE) is formulated by integrating the Bellman optimality principle with Itô's lemma, thereby accommodating stochastic disturbances and enhancing controller robustness. To improve data utilization and eliminate reliance on explicit drift dynamics, an integral reinforcement learning (IRL) algorithm is developed within this framework. Furthermore, a time-varying single-critic network is designed to approximate the solution to the HJBE and generate optimal control policies, thereby considerably reducing computational complexity. To further enhance learning efficiency and relax the persistent excitation (PE) condition, the experience replay (ER) technique is incorporated into the update process of the critic weight. Finally, two simulation examples are provided to verify the feasibility and effectiveness of the proposed approach.
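
The role of experience replay in the critic update can be illustrated with a small Python sketch. It is a generic normalized gradient step on integral Bellman residuals, assuming a critic that is linear in its weights; it is not the paper's update law, and it omits the time dependence and the stochastic expectation. The names `bellman_residual`, `critic_update`, `delta_phi`, and `integral_cost` are hypothetical.

```python
def bellman_residual(weights, delta_phi, integral_cost):
    """Integral Bellman residual for a linear critic V(x) = W^T phi(x).

    delta_phi is phi(x(t + T)) - phi(x(t)); integral_cost is the running
    cost accumulated over the same interval. No drift model is needed.
    """
    return sum(w * d for w, d in zip(weights, delta_phi)) + integral_cost


def critic_update(weights, current, replay_buffer, learning_rate=0.5):
    """One normalized gradient step over the current and stored samples."""
    gradient = [0.0] * len(weights)
    for delta_phi, integral_cost in [current] + list(replay_buffer):
        residual = bellman_residual(weights, delta_phi, integral_cost)
        normalizer = (1.0 + sum(d * d for d in delta_phi)) ** 2
        for i, d in enumerate(delta_phi):
            gradient[i] += d * residual / normalizer
    return [w - learning_rate * g for w, g in zip(weights, gradient)]
```

In this sketch, the stored samples supply the richness that a single trajectory sample lacks: if the buffered `delta_phi` vectors span the weight space, the weights can converge even when the current sample alone excites only one direction. This is the intuition behind using replay to relax the PE condition.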
