Ensuring safety in reinforcement learning (RL)-based robotic systems is a critical challenge, especially in contact-rich tasks within unstructured environments. While state-of-the-art safe RL approaches mitigate risks through safe exploration or high-level recovery mechanisms, they often overlook low-level execution safety, where reflexive responses to potential hazards are crucial. Similarly, variable impedance control (VIC) enhances safety by adjusting stiffness, yet lacks a systematic way to adapt its parameters throughout the task. In this paper, we propose Bresa, a Bio-inspired Reflexive Hierarchical Safe RL method modeled on biological reflexes. Our method decouples task learning from safety learning, incorporating a safety critic network that evaluates action risks and operates at a higher frequency than the task solver. Unlike existing recovery-based methods, our safety critic functions at the low-level control layer, allowing real-time intervention when unsafe conditions arise. The task-solving RL policy, running at a lower frequency, focuses on high-level planning (decision-making), while the safety critic ensures instantaneous safety corrections.
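The two-rate structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Command`, `run_episode`, `task_policy`, `safety_critic`, and `step_env`, the risk threshold, and the choice to cap stiffness as the reflexive correction are all assumptions made for illustration. The abstract states only that the safety critic runs faster than the task policy and intervenes at the low-level control layer.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Command:
    target: float     # desired position chosen by the task policy (hypothetical)
    stiffness: float  # impedance stiffness for the low-level controller (hypothetical)


def run_episode(
    task_policy: Callable[[float], Command],
    safety_critic: Callable[[float, Command], float],
    step_env: Callable[[float, Command], float],
    state: float,
    high_level_steps: int = 3,
    low_per_high: int = 5,        # assumed ratio of critic rate to policy rate
    risk_threshold: float = 0.5,  # assumed intervention threshold
    safe_stiffness: float = 50.0, # assumed compliant fallback stiffness
) -> List[Tuple[float, float, float]]:
    """Run a slow task policy with a fast safety check on every control tick."""
    log = []
    for _ in range(high_level_steps):
        # Slow loop: the task policy is queried once per high-level step.
        cmd = task_policy(state)
        for _ in range(low_per_high):
            # Fast loop: the critic scores the current state and command.
            risk = safety_critic(state, cmd)
            applied = cmd
            if risk > risk_threshold:
                # Assumed reflex: fall back to a compliant stiffness.
                applied = Command(cmd.target, min(cmd.stiffness, safe_stiffness))
            state = step_env(state, applied)
            log.append((state, applied.stiffness, risk))
    return log


if __name__ == "__main__":
    # Toy 1-D example: the policy pushes toward 1.2, the critic flags x >= 0.75.
    policy = lambda s: Command(target=1.2, stiffness=500.0)
    critic = lambda s, c: 1.0 if s >= 0.75 else 0.0
    env = lambda s, c: s + 0.1 * (c.stiffness / 500.0)
    for state, stiffness, risk in run_episode(policy, critic, env, 0.0):
        print(f"x={state:.2f} stiffness={stiffness:.0f} risk={risk:.0f}")
```

In this sketch the policy is queried once every `low_per_high` control ticks, while the critic is evaluated on every tick, so a correction can take effect without waiting for the next high-level decision.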
We validate Bresa on multiple tasks, including a contact-rich robotic task, demonstrating its ability to enhance safety reflexively and to adapt to unforeseen dynamic environments. Our results show that Bresa outperforms the baseline, providing a robust and reflexive safety mechanism that bridges the gap between high-level planning and low-level execution. A video of our real-world experiments is shown below.
@ARTICLE{10517611,
author={Zhang, Heng and Solak, Gokhan and Lahr, Gustavo J. G. and Ajoudani, Arash},
journal={IEEE Robotics and Automation Letters},
title={SRL-VIC: A Variable Stiffness-Based Safe Reinforcement Learning for Contact-Rich Robotic Tasks},
year={2024},
volume={9},
number={6},
pages={5631-5638},
doi={10.1109/LRA.2024.3396368}}
This work was supported by the Horizon Europe Project TORNADO (GA101189557).