Supplementary material for the paper "Think Fast! Learning to Control Online Reasoning in Stochastic Environments", expected to be published in Proc. of the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS '26).
Budd et al. studied this question.