What question did this study set out to answer?

The aim is to minimize carbon emissions during inference of large language models using quantum-enhanced scheduling.

February 8, 2026 | Open Access

Quantum-Enhanced Carbon-Aware Scheduling for Large Language Model Inference

Key Points

The aim is to minimize carbon emissions during inference of large language models using quantum-enhanced scheduling.
Developed a hybrid quantum-classical framework utilizing the quantum approximate optimization algorithm (QAOA).
Benchmarking performed against classical brute force and genetic algorithms in various environments.
Focused on layer-to-hardware mapping to optimize carbon emissions.
QAOA achieves near-perfect optimality, with a gap of less than $10^{-5}$.
Successfully adapts to 24-hour grid carbon fluctuations, realizing a simulated saving of 23.76 gCO$_2$e.
Identified a 'Simulation Wall' at N=15 layers, where classical simulation of the quantum circuit becomes computationally prohibitive; genetic algorithms remain fast but lose theoretical guarantees.

Abstract

As the computational demand for Large Language Models (LLMs) surges, minimizing the carbon footprint of inference has become a critical challenge. While classical schedulers optimize for throughput, they often neglect the spatial and temporal variance of grid carbon intensity. This paper presents a Hybrid Quantum-Classical (HQC) framework utilizing the Quantum Approximate Optimization Algorithm (QAOA) to solve the layer-to-hardware mapping problem with the explicit objective of minimizing gCO₂e emissions. We benchmark our QAOA optimizer against classical Brute Force and Genetic Algorithms across static, dynamic, and noisy environments. Our results demonstrate that QAOA achieves near-perfect optimality (gap $< 10^{-5}$) and successfully adapts to 24-hour grid fluctuations, realizing a simulated carbon saving of 23.76 gCO₂e. However, the study also reveals a "Simulation Wall" at N=15 layers, where classical simulation of the quantum circuit becomes computationally prohibitive, whereas Genetic Algorithms maintain speed at the cost of theoretical guarantees. We conclude that QAOA represents a scalable, robust pathway for green AI, provided the optimizer is migrated from simulation to physical Quantum Processing Units (QPUs).
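
To make the layer-to-hardware mapping objective concrete, the following is a minimal sketch of a brute-force baseline of the kind the abstract mentions. It is not the authors' implementation: the per-layer energy values, the site names, the carbon intensities, and the per-site capacity limit are all hypothetical, chosen only for illustration. The capacity limit is an added assumption; without some such constraint, every layer would simply go to the lowest-intensity site.

```python
from itertools import product

# Hypothetical inputs (not taken from the study):
# energy drawn by each model layer, in kWh per inference batch.
LAYER_ENERGY_KWH = [0.002, 0.003, 0.001, 0.004]
# Grid carbon intensity at each hardware site, in gCO2e per kWh.
SITE_INTENSITY = {"site_a": 450.0, "site_b": 120.0, "site_c": 300.0}
# Assumed limit on how many layers each site can host.
SITE_CAPACITY = {"site_a": 2, "site_b": 2, "site_c": 2}


def emissions(assignment):
    """Total gCO2e for an assignment: one site name per layer."""
    return sum(
        energy * SITE_INTENSITY[site]
        for energy, site in zip(LAYER_ENERGY_KWH, assignment)
    )


def feasible(assignment):
    """True if no site hosts more layers than its assumed capacity."""
    return all(assignment.count(s) <= cap for s, cap in SITE_CAPACITY.items())


def brute_force():
    """Enumerate every layer-to-site mapping and keep the lowest-emission one."""
    sites = list(SITE_INTENSITY)
    candidates = (
        a for a in product(sites, repeat=len(LAYER_ENERGY_KWH)) if feasible(a)
    )
    return min(candidates, key=emissions)


best = brute_force()
print(best, round(emissions(best), 4))
```

With M sites and N layers the enumeration visits M^N candidate mappings (81 in this toy case), which is why exhaustive search serves only as a small-scale reference for checking an optimizer's optimality gap.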
