[arXiv'22] Multi-Agent Vulnerability Discovery for Autonomous Driving with Hazard Arbitration Reward

The VDARS framework.

Abstract

Discovering hazardous driving scenarios is crucial for further improving autonomous driving policies. However, discovering such scenarios efficiently faces two key challenges. First, a well-trained driving policy rarely encounters hazardous scenarios naturally. Second, accident responsibility must be determined properly, since scenarios with wrongly attributed responsibility are of no value for refining the under-test driving policy. Hence, we aim to discover hazardous scenarios that are autonomous-vehicle responsible (AV-responsible), i.e., vulnerabilities of the under-test driving policy. To this end, this work proposes VDARS, a Vulnerability Discovery framework that finds AV-Responsible Scenarios using multi-agent reinforcement learning. VDARS efficiently guides other traffic participants to produce AV-responsible scenarios in which the under-test driving policy misbehaves. The key designs in VDARS are the Hazard Arbitration Reward (HAR) and the Scenarios Distinction Intrinsic Rewards (SDIR), which enable the framework to discover rare, valuable, and diverse AV-responsible hazardous scenarios. VDARS also makes it possible to extract the key causal actions of a hazardous scenario, which supports automatic analysis of the scenarios. Experimental results on Baidu Apollo and other types of driving policies demonstrate that VDARS discovers more diverse AV-responsible hazardous scenarios, and does so more efficiently, than existing methods. The discovered scenarios are valuable for further improving autonomous driving policies.

Xuefei Ning
Research Assistant Professor at Tsinghua University

My primary research interests are neural architecture search and efficient deep learning.