Safety Guaranteed Robust Multi-Agent Reinforcement Learning with Hierarchical Control for Connected and Automated Vehicles

IEEE International Conference on Robotics and Automation (ICRA 2025)

1. School of Computing, University of Connecticut 2. Department of Electrical and Computer Engineering, Boston University 3. Department of Computer Science, University of Maryland



Intersection scenario. (a): Two HDVs run the red light while two CAVs are passing through the intersection box; CAVs are shown in green and HDVs in red. (b): With our method, the CAVs successfully avoid collisions during testing with state uncertainties. (c): CAV1 using the benchmark (MCP) collides with an HDV because the perturbed location of HDV1 (marked with a yellow triangle) misleads the CAV into believing a collision will not happen.

Abstract

We address the problem of coordination and control of Connected and Automated Vehicles (CAVs) in the presence of imperfect observations in mixed traffic environments. A commonly used approach is learning-based decision-making, such as reinforcement learning (RL). However, most existing safe RL methods suffer from two limitations: (i) they assume accurate state information, and (ii) safety is generally defined only over the expectation of the trajectories. It remains challenging to design optimal coordination among multiple agents while ensuring hard safety constraints under system state uncertainties (e.g., those arising from noisy sensor measurements, communication, or state estimation) at every time step. We propose a safety-guaranteed hierarchical coordination and control scheme, Safe-RMM, to address this challenge. Specifically, the high-level coordination policy of CAVs in the mixed traffic environment is trained by the Robust Multi-Agent Proximal Policy Optimization (RMAPPO) method. Though trained without uncertainty, our method leverages a worst-case Q network to ensure robust performance when state uncertainties are present during testing. The low-level controller is implemented using model predictive control (MPC) with robust Control Barrier Functions (CBFs) to guarantee safety through their forward invariance property. We compare our method with baselines on different road networks in the CARLA simulator. Results show that our method achieves the best safety and efficiency among the evaluated approaches in challenging mixed traffic environments with uncertainties.

Contribution

  • We propose a hierarchical decision-making framework, Safe-RMM, for CAVs in mixed traffic environments. The framework comprises two levels: the high level is robust MARL (the "RM" in Safe-RMM), which determines discrete actions conditioned on the behavior of other CAVs and HDVs. The low-level controller uses MPC (the final "M" in Safe-RMM) with CBFs to execute the high-level plan while guaranteeing safety with respect to neighboring vehicles through the forward invariance property of CBFs.
  • To handle state uncertainties, we design a robust MARL algorithm that requires training only one additional critic per agent and no prior knowledge of the uncertainties (see the sketch after this list). Additionally, the MPC controller incorporates robust CBFs to consistently generate safe controls given the MARL decisions, endowing it with robustness against erroneous system states.
  • We validate through experiments in the CARLA simulator that the proposed Safe-RMM approach significantly improves the collision-free rate and allows the CAV agents to achieve higher overall returns compared to baseline methods. Ablation studies further highlight the contributions of both the robust MARL algorithm and the MPC-CBF controller, as well as their reciprocal effects.
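
The following is a minimal, self-contained sketch of the idea of pairing the standard value critic with an extra worst-case Q critic in a PPO-style actor update. It is an illustration under assumed details (a sampling-based inner minimization over an l-infinity ball of radius epsilon, a weighting coefficient kappa, and toy network sizes), not the authors' RMAPPO implementation.

import torch
import torch.nn as nn

class MLPCritic(nn.Module):
    # Small MLP critic; the input is a concatenated observation-action vector.
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def sampled_worst_q(q_net, obs, act_onehot, epsilon=0.1, n_samples=8):
    # Approximate min over ||delta||_inf <= epsilon of Q(obs + delta, act)
    # by randomly sampling perturbations (an illustrative approximation).
    q_min = None
    for _ in range(n_samples):
        delta = (torch.rand_like(obs) * 2.0 - 1.0) * epsilon
        q = q_net(torch.cat([obs + delta, act_onehot], dim=-1))
        q_min = q if q_min is None else torch.minimum(q_min, q)
    return q_min

def actor_loss(ratio, advantage, worst_q, clip=0.2, kappa=0.5):
    # PPO clipped surrogate plus a term favoring actions whose worst-case
    # Q value stays high under bounded state perturbations.
    clipped = torch.clamp(ratio, 1.0 - clip, 1.0 + clip)
    ppo_term = -torch.min(ratio * advantage, clipped * advantage).mean()
    robust_term = -kappa * worst_q.mean()
    return ppo_term + robust_term

# Toy usage: batch of 32, observation dimension 10, 4 discrete actions.
obs = torch.randn(32, 10)
act = torch.nn.functional.one_hot(torch.randint(0, 4, (32,)), 4).float()
q_net = MLPCritic(in_dim=10 + 4)
worst_q = sampled_worst_q(q_net, obs, act)
loss = actor_loss(torch.ones(32), torch.randn(32), worst_q)
loss.backward()

In this sketch, only the extra critic and its loss term are added on top of a standard MAPPO-style update, mirroring the claim above that robustness is obtained by training one additional critic without prior knowledge of the uncertainties.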

Method



    The figure illustrates one agent's decision pipeline; all other agents follow the same procedure. During training, both the value network and the worst-case Q network contribute to updating the actor's policy. During testing, agent \( i \) feeds its states with uncertainty into the actor and samples the high-level action \( a_i \), which is then handled by the robust MPC controller for path planning and generation of the safe control \( u_i \).
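
    To make the low-level step concrete, the following is a minimal single-step sketch of a CBF-based quadratic program that tracks a high-level reference acceleration while tightening a time-headway safety constraint by a bound on the position error. The dynamics, the choice of CBF, and the parameters (tau, d_min, eps_state, alpha, u_max) are illustrative assumptions, not the paper's full robust MPC-CBF formulation.

import cvxpy as cp

def safe_longitudinal_control(p_ego, v_ego, p_lead, v_lead, u_ref,
                              tau=1.2, d_min=5.0, eps_state=0.5,
                              alpha=1.0, u_max=3.0):
    # Time-headway CBF h(x) = (p_lead - p_ego) - d_min - tau * v_ego,
    # tightened by eps_state to cover the worst-case position error.
    h = (p_lead - p_ego) - d_min - tau * v_ego - eps_state
    u = cp.Variable()  # ego longitudinal acceleration
    # h_dot = v_lead - v_ego - tau * u; enforce h_dot >= -alpha * h.
    cbf_constraint = (v_lead - v_ego) - tau * u + alpha * h >= 0
    prob = cp.Problem(cp.Minimize(cp.square(u - u_ref)),
                      [cbf_constraint, cp.abs(u) <= u_max])
    prob.solve()
    return float(u.value)

# Example: the high-level policy requests u_ref = 2.0 m/s^2, but the lead
# vehicle is close, so the QP returns a braking command instead.
u_safe = safe_longitudinal_control(p_ego=0.0, v_ego=12.0,
                                   p_lead=20.0, v_lead=10.0, u_ref=2.0)
print(u_safe)  # roughly -1.58 under these illustrative numbers

    Keeping the safety condition as a hard constraint of the optimization, rather than a penalty, is what carries the forward-invariance guarantee of CBFs over to the executed control; the eps_state margin is one simple way to retain that guarantee when the measured state may be off by a bounded amount.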


    Qualitative Results

     


    Video

    BibTeX

    @article{zhang2023safety,
            title={Safety Guaranteed Robust Multi-Agent Reinforcement Learning with Hierarchical Control for Connected and Automated Vehicles},
            author={Zhang, Zhili and Ahmad, HM and Sabouni, Ehsan and Sun, Yanchao and Huang, Furong and Li, Wenchao and Miao, Fei},
            journal={arXiv preprint arXiv:2309.11057},
            year={2024}
          }