This study presents a comparative evaluation, based on risk-adjusted performance metrics, of three reinforcement learning algorithms: Q-learning, Deep Q-Networks (DQN), and Proximal Policy Optimization (PPO) (Schulman et al.). Each algorithm was trained on stock price data for AAPL, MSFT, and SPY covering January 1, 2012 to June 1, 2014, and evaluated on Sharpe and Sortino ratios, maximum drawdown, total return, and number of trades. Across a majority of the evaluated metrics and stocks, the PPO agent outperformed both the DQN and Q-learning agents in overall profitability. However, the study covers only a narrow range of stocks and market structures, which may limit how far the results generalize.
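The evaluation metrics named above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the 252-trading-day annualization factor, the zero risk-free rate, and the zero downside target are all assumptions chosen for illustration, and the input is a hypothetical portfolio value series.

```python
import math

# Illustrative sketch only: names and defaults below are assumptions,
# not taken from the study.

def period_returns(equity):
    """Simple per-period returns from a portfolio value series."""
    return [equity[i] / equity[i - 1] - 1.0 for i in range(1, len(equity))]

def total_return(equity):
    """Overall return from the first to the last portfolio value."""
    return equity[-1] / equity[0] - 1.0

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(var)
    if std == 0.0:
        return float("nan")  # undefined when returns do not vary
    return mean / std * math.sqrt(periods_per_year)

def sortino_ratio(returns, target=0.0, periods_per_year=252):
    """Like Sharpe, but penalizes only returns below the target."""
    excess = [r - target for r in returns]
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / len(excess))
    if downside == 0.0:
        return float("nan")  # undefined when no return falls below the target
    return mean / downside * math.sqrt(periods_per_year)

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a negative fraction (0.0 if none)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst

if __name__ == "__main__":
    equity = [100.0, 110.0, 99.0, 120.0]  # hypothetical portfolio values
    rets = period_returns(equity)
    print("total return:", total_return(equity))
    print("sharpe:", sharpe_ratio(rets))
    print("sortino:", sortino_ratio(rets))
    print("max drawdown:", max_drawdown(equity))
```

Because the Sortino ratio counts only downside deviation, an agent with large but mostly positive swings can score well on it while scoring lower on the Sharpe ratio, which is one reason for reporting both alongside maximum drawdown.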
Ayaan Prasad
Ayaan Prasad (Tue,) studied this question.
synapsesocial.com/papers/69e07dfe2f7e8953b7cbef6a — DOI: https://doi.org/10.5281/zenodo.19570097