PPO-GPR: A Custom Proximal Policy Optimization Tool for Active Reinforcement Learning | Synapse