What question did this study set out to answer?

To develop novel reinforcement learning algorithms for high-dimensional problems and practical applications in robotics.

March 15, 2026 · Open Access

Novel distributional reinforcement and ensemble learning algorithms

Key Points

Developed a Cramér-based Soft Distributional Soft Actor-critic (C-DSAC) algorithm within the maximum-entropy Actor-Critic framework.
Investigated confidence-driven model updates for improved value function approximation.
Designed a Reinforcement Learning - Inverse Kinematics (RL-IK) meta-algorithm for robotic control tasks.
Explored supervised learning techniques for state identification, focusing on image classification.
Demonstrated superior performance of C-DSAC in complex environments compared to other RL algorithms.
Showed enhanced convergence to near-optimal policies in robotic applications using RL-IK.
Validated new supervised learning approaches for state identification in image classification tasks.

Abstract

This dissertation focuses on Deep Reinforcement Learning (DRL), a neural network-based approach for solving Markov Decision Processes in high-dimensional spaces with unknown transition dynamics. The main contribution of this thesis is the development of a novel state-of-the-art distributional reinforcement learning algorithm within the maximum-entropy Actor-Critic framework. This algorithm, termed "Cramér-based Soft Distributional Soft Actor-critic" (C-DSAC), outperforms other RL algorithms, especially in environments with high-dimensional spaces and complex dynamics. Its performance is shown to be partly rooted in a phenomenon arising in Cramér-metric-based Distributional Reinforcement Learning, referred to as confidence-driven model updates. This mechanism ensures that the value function approximator is updated more conservatively when confidence in its estimates is low. Theoretical justifications for the algorithm are provided, demonstrating its convergence in the policy evaluation setting and, under widely accepted mild assumptions, in the control setting as well. Beyond foundational algorithmic research, this thesis contributes to the practical application of RL in robotics. Given the crucial role of multi-joint robotic systems in modern production technology, an RL meta-algorithm called "Reinforcement Learning - Inverse Kinematics" (RL-IK) is devised. This approach enhances the applicability of reinforcement learning to robotic control tasks by significantly accelerating convergence to near-optimal policies compared to standard RL methods. An essential prerequisite for real-world RL applications in control systems is machine perception for state identification. To address challenges in this field, this thesis explores novel Supervised Learning (SL) approaches, validated on image classification tasks, with a focus on ensemble learning strategies.
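For readers unfamiliar with the Cramér metric mentioned above, the following is a minimal sketch of the standard squared Cramér distance between two categorical return distributions on a shared, sorted support: the integral of the squared difference of their cumulative distribution functions. It is not the C-DSAC loss, which the abstract does not specify. The function name `cramer_distance` and the fixed-support (categorical) representation are assumptions made for illustration only.

```python
import numpy as np


def cramer_distance(p, q, support):
    """Squared Cramér distance between two categorical distributions.

    Illustrative helper (hypothetical name, not taken from the dissertation).
    `p` and `q` are probability vectors over the same sorted `support`.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = np.asarray(support, dtype=float)

    # Cumulative distribution functions evaluated at each atom.
    cdf_p = np.cumsum(p)
    cdf_q = np.cumsum(q)

    # Both CDFs are piecewise constant between consecutive atoms, so the
    # integral of the squared CDF difference reduces to a weighted sum
    # over the gaps between atoms.
    widths = np.diff(support)
    return float(np.sum((cdf_p[:-1] - cdf_q[:-1]) ** 2 * widths))
```

For example, `cramer_distance([1, 0, 0], [0, 0, 1], [0, 1, 2])` evaluates to `2.0`, and the distance between two identical distributions is `0.0`. Because the distance is computed from CDFs, it accounts for how far apart the probability mass lies on the return axis, not only whether the two probability vectors differ atom by atom.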

Authors

Vanya Aziz
