August 12, 2024 | Open Access

Hierarchical In-Context Reinforcement Learning with Hindsight Modular Reflections for Planning

Abstract

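Large language models (LLMs) have demonstrated remarkable abilities across a range of language tasks, making them promising candidates for decision-making in robotics. Inspired by hierarchical reinforcement learning (HRL), we propose Hierarchical in-Context Reinforcement Learning (HCRL), a novel framework in which an LLM-based high-level policy decomposes a complex task into sub-tasks on the fly. Each sub-task is defined by a goal and assigned to a low-level policy to complete. Once the LLM agent determines that the current goal has been achieved, a new goal is proposed. To improve the agent's performance over multiple episodes, we propose Hindsight Modular Reflection (HMR): instead of reflecting on the full trajectory, we replace the task objective with the intermediate goals and let the agent reflect on the shorter trajectories, which improves reflection efficiency. We evaluate the decision-making ability of HCRL in three benchmark environments: ALFWorld, Webshop, and HotpotQA. Results show that, within 5 episodes of execution, HCRL improves performance by 9%, 42%, and 10% over strong in-context learning baselines.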
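The control flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the LLM calls are replaced by hypothetical stub functions (`propose_goal`, `act`, `goal_done`, `reflect`), and the environment is a toy integer counter (`ToyEnv`) invented purely for illustration. It shows only the structure: a high-level policy proposes goals, a low-level policy acts until the goal is judged complete, and reflection is done per goal segment rather than over the whole trajectory.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Every name here is a hypothetical stand-in. In the paper these roles are
# played by LLM prompts; here they are deterministic stubs.

@dataclass
class ToyEnv:
    """Toy environment: the state is an integer and the task is to reach `target`."""
    target: int
    state: int = 0

    def step(self, action: int) -> int:
        self.state += action
        return self.state

    def solved(self) -> bool:
        return self.state == self.target


def propose_goal(state: int, target: int) -> Optional[int]:
    """High-level policy stub: next intermediate goal, at most 3 away from the state."""
    if state == target:
        return None  # task finished, no further goal
    return min(state + 3, target)


def act(state: int, goal: int, reflections: List[str]) -> int:
    """Low-level policy stub: move one unit toward the goal (reflections unused here)."""
    return 1 if goal > state else -1


def goal_done(state: int, goal: int) -> bool:
    """Stub for the agent's judgement that the current goal is complete."""
    return state == goal


def reflect(goal: int, segment: List[Tuple[int, int]]) -> str:
    """Reflection stub: summarise one short, goal-conditioned segment."""
    return f"goal {goal}: {len(segment)} steps"


def run_episode(env: ToyEnv, reflections: List[str], max_steps: int = 50):
    """Run one episode and return a list of (goal, segment) pairs."""
    segments = []
    steps = 0
    while steps < max_steps:
        goal = propose_goal(env.state, env.target)
        if goal is None:
            break
        segment: List[Tuple[int, int]] = []
        while not goal_done(env.state, goal) and steps < max_steps:
            action = act(env.state, goal, reflections)
            segment.append((env.state, action))  # record (state, action) pairs
            env.step(action)
            steps += 1
        segments.append((goal, segment))
    return segments


def hindsight_modular_reflection(segments) -> List[str]:
    """Reflect on each goal segment separately instead of on the full trajectory."""
    return [reflect(goal, segment) for goal, segment in segments]


def run_trials(target: int, episodes: int = 5) -> List[str]:
    """Multi-episode loop: reflections from earlier episodes are fed to later ones."""
    reflections: List[str] = []
    for _ in range(episodes):
        env = ToyEnv(target=target)
        segments = run_episode(env, reflections)
        reflections.extend(hindsight_modular_reflection(segments))
    return reflections
```

In this toy setting, `run_episode(ToyEnv(target=7), [])` splits the task into the goals 3, 6, and 7, and each reflection covers only the steps taken for its own goal. A real agent would replace each stub with an LLM call and would condition the low-level policy on the stored reflections.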

Authors

Chuanneng Sun

Songjun Huang

Dario Pompili
