April 17, 2026 · Open Access

Artificial Intelligence-Based Decision Support System for UAV Control in a Simulated Environment

Key Points

The research aims to develop an AI-based decision support system for UAVs operating in environments with limited guidance and communication.
Investigated reinforcement learning strategies for UAV control in a 3D simulated environment.
Compared the performance of two policy-gradient methods: REINFORCE and Proximal Policy Optimization (PPO).
Trained the UAV to learn its control policy through interaction with the environment and reward-driven adaptation, rather than by following pre-programmed waypoints.
The PPO-based approach achieved higher mission effectiveness than REINFORCE in the final unseen test scenario.
Results support the relevance of deep reinforcement learning in enhancing UAV operations under uncertain conditions.

Abstract

Unmanned aerial vehicles (UAVs) are increasingly deployed in missions that require high autonomy and reliable decision-making; however, many operational concepts still assume access to GNSS and stable communication with a human operator. In contested environments, this assumption may no longer hold: GNSS degradation, radio-frequency interference, and intentional jamming can disrupt positioning and communication, reducing mission effectiveness and safety. Recent surveys show that operation in GNSS-denied environments remains a major challenge and often requires alternative perception, localization, and control strategies. In response, this article investigates a reinforcement learning (RL)-based decision-support system for the autonomous control of a quadrotor UAV in a three-dimensional simulated environment. Rather than following pre-programmed waypoints, the UAV learns a control policy through interaction with the environment and reward-driven adaptation. The proposed system is designed for mission execution under uncertainty, limited external guidance, and partial observability. Two policy-gradient approaches are implemented and compared: classical REINFORCE and Proximal Policy Optimization (PPO) with an Actor–Critic architecture. The study presents the simulation environment, state and action representation, reward formulation, staged training procedure, and comparative evaluation. The results indicate that, in the final unseen test scenario, the PPO-based configuration achieved higher mission effectiveness than REINFORCE, supporting the practical relevance of structured deep reinforcement learning for UAV operation in GNSS-denied and communication-constrained environments.
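To make the comparison in the abstract concrete, the following is a minimal sketch of the two textbook objectives: the REINFORCE loss and the PPO clipped surrogate loss. It is not the authors' implementation. The function names, the discount factor `gamma=0.99`, and the clipping range `clip_eps=0.2` are illustrative assumptions, and the advantage estimates that an Actor-Critic setup would compute from its critic are simply passed in as numbers.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for each step of one episode.

    gamma is an assumed value, not one reported in the source.
    """
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """REINFORCE loss: -mean(log pi(a_t | s_t) * G_t)."""
    weighted = [lp * g for lp, g in zip(log_probs, returns)]
    return -sum(weighted) / len(weighted)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate loss.

    Loss = -mean(min(r * A, clip(r, 1 - eps, 1 + eps) * A)),
    where r is the probability ratio pi_new / pi_old.
    clip_eps is an assumed value, not one reported in the source.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


if __name__ == "__main__":
    # Toy episode with made-up rewards and log-probabilities.
    rewards = [1.0, 0.0, 2.0]
    returns = discounted_returns(rewards)
    print("returns:", returns)
    print("REINFORCE loss:", reinforce_loss([-0.5, -1.0, -0.2], returns))
    print("PPO loss:", ppo_clip_loss([-0.4, -1.1, -0.2], [-0.5, -1.0, -0.2], returns))
```

The practical difference is that REINFORCE weights each log-probability by the full episode return, while the clipped objective stops rewarding a policy update once the probability ratio leaves the interval `[1 - clip_eps, 1 + clip_eps]` in the direction favored by the advantage.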

Authors

Przemysław Sujecki

Damian Frąszczak

Journals

Sensors

Institutions

Military University of Technology in Warsaw
