What question did this study set out to answer?

The aim is to explore how quantum optics can address reinforcement learning tasks effectively and scalably.

April 1, 2026 | Open Access

Scalable conflict-free bandit algorithm using a quantum optical setup

Key Points

Used the Orbital Angular Momentum (OAM) of photons to encode each player's preferences in the OAM amplitudes.
Optimized the phases of the OAM amplitudes so that players avoid conflicting choices.
Compared performance against existing techniques for the Competitive Multi-Armed Bandit (CMAB) problem.
Achieved conflict avoidance through quantum interference, using purely physical attributes of light.
Demonstrated improved performance over existing techniques.
Showed that the approach scales with the number of options in the bandit problem.

Abstract

Quantum optics utilizes the unique properties of light for computation or communication. In this work, we explore its ability to solve certain reinforcement learning tasks, with a particular view towards the scalability of the approach. Our method utilizes the Orbital Angular Momentum (OAM) of photons to solve the Competitive Multi-Armed Bandit (CMAB) problem while maximizing rewards. In particular, we encode each player’s preferences in the OAM amplitudes, while the phases are optimized to avoid conflicts. We find that the proposed system is capable of solving the CMAB problem with a scalable number of options and demonstrates improved performance over existing techniques. Our method utilizes quantum interference to guarantee conflict avoidance using purely physical attributes of light in a way impossible for a classical setup. As an example of a system with simple rules for solving complex tasks, our OAM-based method adds to the repertoire of functionality of quantum optics.
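
To make the idea of interference-based conflict avoidance concrete, here is a minimal toy sketch in Python. It is not the setup or algorithm from the paper. It assumes two players, encodes each player's preferences as the squared magnitudes of an amplitude vector, and uses an antisymmetrized two-photon amplitude so that the probability of both players selecting the same option cancels exactly. The function name, the example preferences, and the phase values are hypothetical and chosen only for illustration.

```python
import numpy as np


def conflict_free_joint_distribution(pref_a, pref_b, phases_b):
    """Toy model (an assumption, not the paper's optical setup).

    pref_a, pref_b: per-option preference probabilities for players A and B.
    phases_b: relative phase attached to each of player B's amplitudes.
    Returns a matrix P where P[i, j] is the probability that A picks
    option i and B picks option j.
    """
    # Amplitude magnitudes are the square roots of the preferences.
    a = np.sqrt(np.asarray(pref_a, dtype=float)).astype(complex)
    b = np.sqrt(np.asarray(pref_b, dtype=float)) * np.exp(
        1j * np.asarray(phases_b, dtype=float)
    )
    # Antisymmetric combination: psi[i, j] = a_i * b_j - b_i * a_j.
    # The diagonal (both players on the same option) cancels by interference.
    psi = np.outer(a, b) - np.outer(b, a)
    weights = np.abs(psi) ** 2
    total = weights.sum()
    if total < 1e-12:
        # Parallel amplitude vectors make the antisymmetric state vanish,
        # which is one reason the relative phases matter in this toy model.
        raise ValueError("amplitude vectors are parallel; state vanishes")
    return weights / total


# Hypothetical preferences over three options and illustrative phases.
probs = conflict_free_joint_distribution(
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.0, 2 * np.pi / 3, 4 * np.pi / 3],
)

# Sample one joint decision from the distribution.
rng = np.random.default_rng(0)
flat_index = rng.choice(probs.size, p=probs.ravel())
arm_a, arm_b = divmod(int(flat_index), probs.shape[1])
```

In this sketch the diagonal of `probs` is zero, so any sampled pair `(arm_a, arm_b)` consists of two different options. How the paper optimizes the phases and extends the scheme to many options is not reproduced here.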

Cite This Study

Konaka et al. (2026) studied this question.

synapsesocial.com/papers/69ccb7b016edfba7beb89ce6 https://doi.org/10.1038/s41534-026-01201-6
