What question did this study set out to answer?

This research aims to improve mobile robot navigation through enhanced safety mechanisms and efficiency.

May 6, 2026 | Open Access

An Adaptive QAPF Framework with a Discrete CBF-Inspired Safety Filter and Adaptive Reward Shaping for Safe Mobile Robot Navigation

Key Points

Extended Q-learning with Artificial Potential Field (QAPF) through three coordinated mechanisms.
Implemented a discrete Control Barrier Function (CBF)-inspired safety filter and adaptive reward shaping.
Conducted tests on held-out static maps with 30 independent seeds.
Reported convergence horizon reduced from ∼3×10⁶ episodes to approximately 200 to 230 episodes.
Collision rate lowered from 6.2% to 0.3% while maintaining a task completion rate of 93.8%.
Achieved 94.5±2.1% navigation success with adaptive QAPF on held-out static maps, and retained 85.0±4.1% success with QAPF+CBF in the combined disturbance regime.
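
As a quick check on the scale of the reductions listed above, the figures can be converted into orders of magnitude. This is a minimal arithmetic sketch that uses only the numbers reported on this page; the helper name `orders_of_magnitude` is hypothetical.

```python
import math

def orders_of_magnitude(before: float, after: float) -> float:
    """Return log10 of the reduction factor before/after."""
    return math.log10(before / after)

# Convergence horizon: ~3e6 episodes down to 200-230 episodes.
conv_low = orders_of_magnitude(3e6, 230)
conv_high = orders_of_magnitude(3e6, 200)

# Collision rate: 6.2% down to 0.3%.
coll = orders_of_magnitude(6.2, 0.3)

print(f"convergence: {conv_low:.2f} to {conv_high:.2f} orders of magnitude")
print(f"collisions:  {coll:.2f} orders of magnitude")
```

The convergence reduction works out to a little over four orders of magnitude and the collision-rate reduction to a little over one.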

Abstract

Mobile robot navigation remains challenging when fast convergence, collision avoidance, and deployability must be satisfied simultaneously. This paper extends the original Q-learning with Artificial Potential Field (QAPF) paradigm with three coordinated mechanisms. Together they yield a reported-horizon convergence reduction of approximately four orders of magnitude (from ∼3×10⁶ episodes to ∼200 to 230 episodes under the present protocol) and an internal-ablation collision-rate reduction of approximately one order of magnitude (6.2% to 0.3%), and they open a new capability frontier covering dynamic obstacles, multi-robot coordination, energy-aware velocity modulation, and embedded-deployable inference timing. The first mechanism is a potential-based reward-shaping schedule whose unclipped fixed-weight form follows the policy-invariant shaping theorem, while the implemented clipped and time-varying form is used as an empirically stable approximation. Under the present experimental protocol, the reported convergence horizon is reduced from the ∼3×10⁶ episodes reported for the original QAPF formulation to approximately 200 to 230 episodes; this comparison is protocol-dependent and is not claimed as a controlled one-to-one runtime speedup. The second mechanism is a discrete Control Barrier Function (CBF)-inspired action filter with per-episode visit memory. The filter is inspired by the continuous-time CBF literature but does not carry a forward-invariance proof; it is used as an empirical safety mechanism rather than as a Control Barrier Function in the formal continuous-time sense. With it, the held-out collision rate is reduced from 6.2% for QAPF alone to 0.3% while 93.8% task completion is maintained. This collision-rate comparison is internal to the QAPF ablation because the prior QAPF reference does not report a comparable held-out collision metric.
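
The difference between the two reward-shaping forms mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the potential (negative Euclidean distance to the goal), the discount factor, the decaying weight schedule, and the clip bound are all assumptions chosen for illustration. Only the general form γΦ(s') − Φ(s) comes from the policy-invariant shaping theorem.

```python
import math

GAMMA = 0.95  # discount factor (assumed value)

def potential(state, goal):
    """Hypothetical potential Phi(s): negative Euclidean distance to the goal."""
    return -math.dist(state, goal)

def shaping_fixed(state, next_state, goal, weight=1.0, gamma=GAMMA):
    """Unclipped, fixed-weight term: w * (gamma * Phi(s') - Phi(s))."""
    return weight * (gamma * potential(next_state, goal) - potential(state, goal))

def shaping_adaptive(state, next_state, goal, episode,
                     w0=1.0, decay=0.01, clip=1.0, gamma=GAMMA):
    """Clipped, time-varying variant; schedule and clip bound are assumptions."""
    weight = w0 / (1.0 + decay * episode)  # weight shrinks as training proceeds
    raw = weight * (gamma * potential(next_state, goal) - potential(state, goal))
    return max(-clip, min(clip, raw))      # clip to [-clip, clip]

goal = (5, 5)
toward = shaping_fixed((0, 0), (1, 1), goal)   # step toward the goal
away = shaping_fixed((1, 1), (0, 0), goal)     # step away from the goal
clipped = shaping_adaptive((0, 0), (1, 1), goal, episode=0)
```

In this sketch a step toward the goal earns a positive bonus and a step away earns a negative one. Clipping and a time-varying weight change the shaping term relative to the fixed-weight form, which is consistent with the abstract treating the implemented variant as an empirical approximation rather than as covered by the theorem.
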
The third mechanism is a set of extensions to dynamic obstacles, two-robot cooperative navigation under a centralized scheme (with an explicit O(N²) scaling-cost analysis and three decentralization strategies for fleets beyond the small-N regime), curriculum learning, and energy-aware velocity modulation. The revised evaluation adds disturbance robustness tests, empirical timeout/stagnation detection for unreachable-goal cases, i7 reference inference timing with projected embedded-device latencies, multi-axis generalization over obstacle density and grid size, scalability analysis for centralized multi-robot coordination, and a scope comparison against A* and RRT*. Across 30 independent seeds on held-out static maps, adaptive QAPF achieves 94.5±2.1% success, while QAPF+CBF achieves 93.8±2.3% success with 0.3±0.4% collisions. Under a separate finite robustness suite, QAPF+CBF retains 85.0±4.1% success in the combined disturbance regime. The timing study indicates that all methods comfortably exceed the 20 Hz real-time threshold on the measured i7 reference platform and on all projected embedded-device equivalents. The results show that APF-guided tabular reinforcement learning, when paired with a discrete safety filter and a clarified energy and robustness analysis, can provide a lightweight, safety-oriented navigation policy for grid-based mobile-robot settings.
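
The discrete CBF-inspired action filter with per-episode visit memory described in the abstract can be sketched on a grid as follows. This is an illustrative sketch under stated assumptions, not the paper's filter: the barrier function (distance to the nearest obstacle minus a margin), the decay rate `alpha`, the revisit penalty, and the fallback rule are all hypothetical. The check h(s') ≥ (1 − α)·h(s) is a commonly used discrete-time analogue of the CBF condition and, as the abstract states for the paper's own filter, carries no forward-invariance proof here.

```python
import math
from collections import Counter

ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def barrier(cell, obstacles, margin=0.5):
    """Hypothetical barrier h(s): distance to nearest obstacle minus a margin."""
    if not obstacles:
        return math.inf
    return min(math.dist(cell, ob) for ob in obstacles) - margin

def filter_action(state, q_values, obstacles, grid_size, visits,
                  alpha=0.8, revisit_penalty=0.1):
    """Return the best-scoring action whose successor satisfies
    h(s') >= (1 - alpha) * h(s). If none does, fall back to the
    successor with the largest barrier value. Assumes grid_size >= 2."""
    h_now = barrier(state, obstacles)
    candidates = []
    for name, (dx, dy) in ACTIONS.items():
        nxt = (state[0] + dx, state[1] + dy)
        if not (0 <= nxt[0] < grid_size and 0 <= nxt[1] < grid_size):
            continue  # moves that leave the grid are never offered
        h_next = barrier(nxt, obstacles)
        # Per-episode visit memory: discourage cells already visited.
        score = q_values[name] - revisit_penalty * visits[nxt]
        candidates.append((name, nxt, h_next, score))
    safe = [c for c in candidates if c[2] >= (1.0 - alpha) * h_now]
    if safe:
        name, nxt, _, _ = max(safe, key=lambda c: c[3])
    else:
        name, nxt, _, _ = max(candidates, key=lambda c: c[2])
    visits[nxt] += 1
    return name, nxt

# The greedy action "right" would enter the obstacle at (2, 1).
q = {"up": 0.5, "down": 0.2, "left": 0.1, "right": 1.0}
visits = Counter()  # reset at the start of every episode
chosen, next_cell = filter_action((1, 1), q, [(2, 1)], 5, visits)
```

Here the filter rejects the greedy move into the obstacle and returns the next-best admissible action. The underlying Q-table is left untouched, so the filter acts purely at action-selection time.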

Authors

Elizabeth Isaac

Asha J. George

Iacovos Ioannou

Journals

Electronics

Institutions

University of Cyprus

Jain University

Koneru Lakshmaiah Education Foundation
