Reinforcement learning (RL) has become a major research focus in machine learning because an agent generates its own training data by interacting with an environment and therefore does not require large labeled datasets. This makes RL well suited to applications where data is scarce or difficult to obtain. In software testing, traditional approaches such as regression testing often suffer from long execution times and low state coverage, which limits both the efficiency and the thoroughness of testing. To address these problems, this paper proposes STRL (Software Testing with Reinforcement Learning), a novel software testing framework built on the Proximal Policy Optimization (PPO) algorithm, a policy-gradient RL method known for balancing exploration and exploitation efficiently. PPO allows STRL to adapt its testing strategy dynamically based on real-time feedback from the software under test. Experimental results show that STRL significantly improves state coverage and reduces testing time compared with both manual testing and traditional automated script testing. By leveraging RL, STRL identifies critical states and transitions more effectively, yielding more comprehensive and efficient testing. These results demonstrate the potential of RL for software testing and suggest that STRL could serve as a valuable tool for improving the quality and efficiency of software development.
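To make the role of PPO more concrete, the following minimal sketch computes the standard PPO clipped surrogate objective for a batch of samples, together with a simple coverage-based reward. The abstract does not specify STRL's reward function or hyperparameters, so `coverage_reward`, the state names, and the default `epsilon=0.2` are illustrative assumptions and not the authors' implementation.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, epsilon=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    Each sample contributes min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r is the probability ratio between the new and old policy.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Taking the minimum gives a pessimistic bound on the improvement.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def coverage_reward(state, visited):
    """Hypothetical reward (an assumption, not taken from the paper):
    1.0 the first time a software state is reached, 0.0 afterwards."""
    if state in visited:
        return 0.0
    visited.add(state)
    return 1.0


if __name__ == "__main__":
    visited = set()
    # Illustrative state names only; a real test harness would supply these.
    rewards = [coverage_reward(s, visited) for s in ["login", "home", "login"]]
    print(rewards)
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

Under these assumptions, an agent is rewarded only for reaching previously unseen states, which is one plausible way to tie the learning signal to state coverage. The clipping term limits how far a single update can move the policy away from the one that collected the data.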
Hanwen et al. studied this question.