May 1, 2024 · Open Access

Accelerating Multi-Stage Sparse-Reward Reinforcement Learning

Abstract

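Following the great success of deep reinforcement learning (DRL) in recent years, developing methods that accelerate DRL algorithms on tasks closer to complex real-world problems is becoming increasingly important. In particular, there is little research on long-horizon tasks that comprise multiple subtasks or intermediate stages and provide a sparse reward only at task completion. This paper proposes to 1) use human prior knowledge to decompose the task and provide abstract demonstrations (the correct sequence of stages) that guide exploration and learning, and 2) adaptively adjust exploration parameters according to the policy's online performance. The proposed ideas are implemented on three popular DRL algorithms, and experimental results on grid-world and manipulation tasks demonstrate the concept and effectiveness of the proposed techniques.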
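To make the second idea concrete, the following is a minimal sketch of one way an exploration parameter could track online performance. It is not the authors' implementation: the class name `AdaptiveEpsilon`, the sliding-window success rate, the linear mapping to an epsilon-greedy rate, and the stage names are all assumptions chosen for illustration. The ordered stage list stands in for an abstract demonstration, with one hypothetical controller per stage.

```python
from collections import deque


class AdaptiveEpsilon:
    """Hypothetical helper: maps a recent success rate to an exploration rate."""

    def __init__(self, eps_min=0.05, eps_max=1.0, window=20):
        self.eps_min = eps_min
        self.eps_max = eps_max
        # Sliding window of recent episode outcomes (1 = success, 0 = failure).
        self.outcomes = deque(maxlen=window)

    def record(self, success):
        """Store the outcome of one episode (or one stage attempt)."""
        self.outcomes.append(1 if success else 0)

    def success_rate(self):
        """Fraction of successes in the window; 0.0 before any data arrives."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def epsilon(self):
        """Linear interpolation: eps_max at 0% success, eps_min at 100%."""
        rate = self.success_rate()
        return self.eps_max - (self.eps_max - self.eps_min) * rate


# Assumed stage names for a grid-world style task, listed in the order an
# abstract demonstration might prescribe. Each stage gets its own controller,
# so exploration can shrink for mastered stages and stay high for new ones.
stages = ["pick_key", "open_door", "reach_goal"]
controllers = {stage: AdaptiveEpsilon(window=10) for stage in stages}

# Example: the first stage has been succeeding, the last one has not.
for _ in range(10):
    controllers["pick_key"].record(True)
    controllers["reach_goal"].record(False)
```

In this sketch a stage that is solved reliably explores near `eps_min`, while an unsolved stage keeps exploring near `eps_max`. Other mappings, or other exploration parameters such as action-noise scale, would fit the same interface.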

Cite this study

Xiang et al. (2024) studied this problem.

www.synapsesocial.com/papers/68e6c6ecb6db64358764564e — DOI: https://doi.org/10.18573/conf1.u

Also consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

Continuous control with deep reinforcement learning · 2015 · 5,371 citations
Reinforcement Learning: An Introduction · 2000 · 8,686 citations
An Open-Source Multi-goal Reinforcement Learning Environment for Robotic Manipulation with Pybullet · 2021 · 17 citations
Human-level control through deep reinforcement learning · 2015 · 29,614 citations
MuJoCo: A physics engine for model-based control · 2012 · 4,392 citations

Authors

Yang Xiang

Zhigang Ji
