Learning Control With Simulated Variational Quantum Policies in a Surrogate Cart-Pole Environment

Abstract

We explore the use of pure quantum policies, implemented via variational quantum circuits (VQCs), for offline reinforcement learning (RL). In contrast to hybrid models, our policy architecture contains no classical neural layers. Built on the MOOSE framework, we replace the classical policy with a VQC enhanced by trainable input and output weights. The policy is trained entirely offline using synthetic rollouts from a learned surrogate model of a physical cart-pole system. Evaluation in this simulated environment shows that the quantum policy performs on par with the classical baseline in terms of stability, smoothness, and reward accumulation. These results demonstrate that purely quantum models can effectively learn control strategies in model-based offline RL, offering a promising step toward real-world quantum-enhanced decision-making.
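The paper's exact ansatz is not given on this page, but the idea of a pure quantum policy with trainable input and output weights can be sketched as follows: observations are encoded as rotation angles scaled by trainable input weights, a variational layer with entanglement is applied, and a Pauli-Z expectation value rescaled by a trainable output weight serves as the control action. The circuit below is a hypothetical 2-qubit illustration simulated directly with NumPy statevectors; the qubit count, gate choice, and function names are assumptions, not the authors' architecture.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

I2 = np.eye(2)
# CNOT with qubit 0 as control, basis order |00>, |01>, |10>, |11>
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
Z = np.diag([1.0, -1.0])

def vqc_policy(obs, w_in, theta, w_out):
    """Pure quantum policy: encode the observation via input-weighted RY
    rotations, apply a variational RY layer plus CNOT entanglement, and
    return the output-weighted <Z> expectation on qubit 0 as the action."""
    state = np.zeros(4)
    state[0] = 1.0                                    # start in |00>
    # data encoding with trainable input weights w_in
    u_enc = np.kron(ry(w_in[0] * obs[0]), ry(w_in[1] * obs[1]))
    # trainable variational layer
    u_var = np.kron(ry(theta[0]), ry(theta[1]))
    state = CNOT @ u_var @ u_enc @ state
    # <Z x I> expectation, rescaled by the trainable output weight
    exp_z = state @ np.kron(Z, I2) @ state
    return w_out * exp_z
```

With all angles zero the circuit acts as the identity on |00>, so the policy outputs `w_out * 1`; the output weight thus sets the action range, which is one motivation for making it trainable rather than fixed.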

inproceedings SHW+25a

QCE 2025

IEEE International Conference on Quantum Computing and Engineering. Albuquerque, NM, USA, Aug 31-Sep 05, 2025.

Authors

Y. Sun • M. Hagog • M. Weber • D. Hein • S. Udluft • Y. Ma • V. Tresp

Links

DOI

Research Area

 A3 | Computational Models

BibTeX Key: SHW+25a
