Effective decision-making requires adapting behaviour to changing outcomes. Across three experiments (n = 218), participants played a competitive number-selection game against computerized opponents with exploitable patterns in Pre and Post phases. In the intermediate During phase, the win rate was fixed (low = punishment; high = reinforcement), disrupting or promoting the patterns learned in Pre. We indexed performance (win rate) and response-rule expression (optimal behaviour rate, OBR) across phases. Participants who aligned their behaviour with During feedback (facilitating under reinforcement, suppressing under punishment) were most likely to reuse the earlier pattern when it again became advantageous in Post. In contrast, stronger suppression of the prior pattern in During, irrespective of feedback valence, predicted better learning of a new pattern later. Negative outcomes selectively impaired the learning of new information but spared the relearning of old information, consistent with inertia in reused patterns and greater flexibility after wins than after losses. These findings outline how feedback shapes the reuse versus change of strategic behaviour.
Zhang et al. (Thu,) studied this question.
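
The two indices named in the abstract, win rate and optimal behaviour rate (OBR), can be illustrated with a minimal sketch. It assumes each trial is recorded as a tuple of phase, whether the participant won, and whether the response matched the exploiting rule. The function name `phase_rates`, the record layout, and the toy data are hypothetical and are not taken from the study's materials or analysis code.

```python
from collections import defaultdict


def phase_rates(trials):
    """Return {phase: (win_rate, obr)} from per-trial records.

    Each record is assumed to be (phase, won, optimal), where `optimal`
    marks a response that follows the exploiting rule (an assumption).
    """
    counts = defaultdict(lambda: [0, 0, 0])  # [trials, wins, optimal responses]
    for phase, won, optimal in trials:
        c = counts[phase]
        c[0] += 1
        c[1] += int(won)
        c[2] += int(optimal)
    return {p: (c[1] / c[0], c[2] / c[0]) for p, c in counts.items()}


# Hypothetical toy data, not results from the experiments.
trials = [
    ("Pre", True, True), ("Pre", False, False),
    ("During", False, True), ("During", False, False),
    ("Post", True, True), ("Post", True, True),
]
rates = phase_rates(trials)
for phase in ("Pre", "During", "Post"):
    win_rate, obr = rates[phase]
    print(f"{phase}: win rate = {win_rate:.2f}, OBR = {obr:.2f}")
```

In the toy During trials a response can follow the rule and still lose. This mirrors the fixed win rate described above, and it is why the two indices are tracked separately.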