What type of study is this?

This is an experimental study.

October 13, 2025

Open Access

Improving Retrospective Language Agents via Joint Policy Gradient Optimization

Key Points

RetroAct framework significantly enhances performance in language agents, enabling continuous learning.
The joint optimization process combines imitation learning and reinforcement learning to improve capabilities.
Extensive experiments demonstrate substantial improvements in decision-making and task performance across environments.
This method reduces dependency on closed-source models while improving training stability and data efficiency.

Abstract

In recent research, large language models (LLMs) have sparked great interest in building autonomous agents. However, current prompt-based agents often rely heavily on large-scale LLMs. Meanwhile, although fine-tuning significantly enhances the capabilities of smaller LLMs, fine-tuned agents often lack the capacity for self-reflection and self-improvement. To address these challenges, we introduce RetroAct, a novel agent framework that jointly optimizes the task-planning and self-reflective evolution capabilities of language agents. Specifically, we develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning, and we design an off-policy joint policy gradient optimization algorithm with imitation learning regularization to improve data efficiency and training stability in agent tasks. RetroAct significantly improves the performance of open-source models, reduces dependency on closed-source LLMs, and enables fine-tuned agents to learn and evolve continuously. We conduct extensive experiments across various testing environments, demonstrating that RetroAct substantially improves task performance and decision-making.

Cite this study

Feng et al. (Mon,) studied this question.

www.synapsesocial.com/papers/68ece2abd1bb2827d129729d — DOI: https://doi.org/10.48550/arxiv.2503.01490

Authors

Xueyang Feng

Bo Lan

Quanyu Dai
