The evolving landscape of cybersecurity threats, characterized by increasingly sophisticated and adaptive attackers, poses major challenges to traditional static network defense mechanisms. To address these limitations, this paper proposes an adaptive cyber defense framework that integrates Reinforcement Learning (RL) with Attack Graph (AG) modeling. The interaction between attacker and defender is formulated as a repeated zero-sum stochastic game over a partially observable Attack Graph-guided environment, allowing both agents to adapt their strategies through repeated interaction. Two value-based learning approaches are investigated, namely tabular Q-learning and Deep Q-Networks (DQN), under a unified attacker–defender setting. Experimental results across multiple training scenarios show that defender performance improves substantially as the training budget increases. Under limited training, Q-learning provides a computationally efficient and stable baseline, while DQN requires more training and careful tuning to achieve strong performance. However, with extended training, the DQN-based defender attains the highest win rate, albeit at a significantly greater computational cost. In addition, multi-run statistical comparisons highlight a clear trade-off between defensive effectiveness and runtime efficiency: Q-learning remains far more lightweight, whereas DQN offers stronger asymptotic performance when sufficient resources are available. These findings demonstrate the promise of learning-based adaptive defense over attack graphs while also emphasizing the importance of training budget, computational constraints, and model selection in practical cyber defense deployment.
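As an illustration of the tabular Q-learning baseline mentioned in the abstract, the standard Q-learning update can be sketched as follows. This is a minimal sketch, not the authors' implementation: the state and action labels (`node_A_compromised`, `patch_B`, `attack_blocked`), the reward value, and the hyperparameters `alpha` and `gamma` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated value.

    Standard rule: Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    An empty next_actions list is treated as a terminal state (future value 0).
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical defender step on an attack graph: the defender patches node B
# after observing node A compromised, and the episode ends with the attack
# blocked (assumed reward of 1.0).
q_table = defaultdict(float)
new_value = q_update(q_table, "node_A_compromised", "patch_B",
                     reward=1.0, next_state="attack_blocked", next_actions=[])
```

In a zero-sum setting, the attacker could maintain its own table and receive the negated reward. A DQN replaces the table with a neural network approximating the same values, which is consistent with the higher training cost the abstract reports.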
Mohammed A. Makarem
Muneef A. Razaz
Zead Saleh
Future Internet
Queen's University
University of Business and Technology
Makarem et al. studied this question.
www.synapsesocial.com/papers/69fadaab03f892aec9b1e6f3 — DOI: https://doi.org/10.3390/fi18050239