Shields for Safe Reinforcement Learning

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement learning (RL) is a prominent machine learning technique used to optimize an agent’s performance in potentially unknown environments. Despite its popularity and success, RL lacks safety guarantees, both during learning and during deployment. This paper reviews a runtime enforcement method called shielding that ensures provable safety for RL. We describe the underlying models, the types of guarantees that can be delivered, and the process of computing shields. Furthermore, we describe several techniques for integrating shields into RL, discuss the advantages and potential drawbacks of this integration, and highlight the current challenges in shielded learning.

Original language: English
Pages (from-to): 80-90
Number of pages: 11
Journal: Communications of the ACM
Volume: 68
Issue number: 11
DOIs
Publication status: Published - 20 Oct 2025

Keywords

  • Game Theory
  • Model Checking
  • Reinforcement Learning
  • Runtime Enforcement
  • Safe Learning
  • Shielding

ASJC Scopus subject areas

  • General Computer Science

Fields of Expertise

  • Information, Communication & Computing
