Assured Multi-Agent Reinforcement Learning for Safety-Critical Scenarios
Poster posted on 11.01.2022, 16:35 by Joshua Riley
Multi-agent reinforcement learning enables a team of agents to solve complex decision-making problems in shared environments. This learning process has proved successful in many areas, but its inherently stochastic nature is problematic when applied to safety-critical domains.
To address this limitation, we propose Assured Multi-Agent Reinforcement Learning (AMARL), which uses a model checking technique called quantitative verification. Quantitative verification provides formal guarantees of agent compliance with safety, performance, and other non-functional requirements, both while reinforcement learning occurs and after a policy has been learned.
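To illustrate the core idea of quantitative verification applied to a learned policy, the sketch below computes the probability of eventually reaching an unsafe state in a small, hypothetical Markov chain induced by a fixed policy, and checks it against a PCTL-style probability bound. The transition matrix, state labels, and the 0.2 threshold are illustrative assumptions, not values from the thesis; a full toolchain such as PRISM would be used in practice.

```python
# P[i][j] = probability of moving from state i to state j under the
# learned policy. Hypothetical 4-state example (not from the thesis):
# state 2 is the goal, state 3 is "unsafe"; both are absorbing.
P = [
    [0.0, 0.9, 0.0, 0.1],    # start state
    [0.0, 0.0, 0.95, 0.05],  # intermediate state
    [0.0, 0.0, 1.0, 0.0],    # goal state (absorbing)
    [0.0, 0.0, 0.0, 1.0],    # unsafe state (absorbing)
]

def unsafe_reachability(P, unsafe, iters=1000, tol=1e-9):
    """Per-state probability of eventually reaching the unsafe state,
    computed by fixed-point iteration on x = P x with x[unsafe] = 1."""
    x = [0.0] * len(P)
    x[unsafe] = 1.0
    for _ in range(iters):
        new = [sum(p * v for p, v in zip(row, x)) for row in P]
        new[unsafe] = 1.0  # the unsafe state reaches itself trivially
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            break
        x = new
    return x

prob_unsafe = unsafe_reachability(P, unsafe=3)[0]
# A PCTL-style requirement such as P<=0.2 [ F "unsafe" ] is met iff the
# reachability probability from the start state stays below the bound.
requirement_met = prob_unsafe <= 0.2
```

A model checker performs this kind of computation symbolically and exactly over the full property language; the fixed-point iteration here only conveys the flavour of checking a learned policy against a quantified safety requirement.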
Our AMARL approach is demonstrated using three separate navigation domains containing patrolling problems. In these domains, the multi-agent systems must learn to visit patrol points to satisfy mission objectives while limiting exposure to risky areas. Different reinforcement learning algorithms have been utilised within these domains: temporal difference learning, game theory, and direct policy search. The performance of these algorithms when combined with our approach is presented.
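A minimal sketch of the kind of patrolling task described above, using tabular Q-learning (a temporal difference method): a single agent on a hypothetical 2x3 grid must reach a patrol point while a penalised "risky" cell encourages a safer detour. The grid layout, rewards, and hyperparameters are illustrative assumptions, not the thesis's actual domains.

```python
import random

random.seed(0)

# Hypothetical 2x3 grid patrol task (illustrative, not from the thesis):
# agent starts at (0, 0), patrol point at (0, 2); cell (0, 1) is risky.
ROWS, COLS = 2, 3
START, GOAL, RISKY = (0, 0), (0, 2), (0, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply a move, clipping at the grid edges, and return reward."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    if nxt == GOAL:
        return nxt, 10.0, True
    if nxt == RISKY:
        return nxt, -5.0, False  # penalise exposure to the risky area
    return nxt, -1.0, False      # step cost encourages short routes

Q = {(r, c, a): 0.0 for r in range(ROWS) for c in range(COLS)
     for a in range(len(ACTIONS))}

alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(2000):  # epsilon-greedy temporal-difference training
    s, done = START, False
    while not done:
        a = (random.randrange(len(ACTIONS)) if random.random() < eps
             else max(range(len(ACTIONS)), key=lambda b: Q[(*s, b)]))
        s2, r, done = step(s, ACTIONS[a])
        best_next = max(Q[(*s2, b)] for b in range(len(ACTIONS)))
        Q[(*s, a)] += alpha * (r + gamma * best_next - Q[(*s, a)])
        s = s2

# Greedy rollout: after training, the learned route typically detours
# through the bottom row, avoiding the risky cell.
s, path = START, [START]
while s != GOAL and len(path) < 10:
    a = max(range(len(ACTIONS)), key=lambda b: Q[(*s, b)])
    s, _, _ = step(s, ACTIONS[a])
    path.append(s)
```

Under AMARL, a policy learned this way would additionally be checked against formal safety requirements, rather than relying on the reward penalty alone to bound risk exposure.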
Lastly, through extensive experimentation we demonstrate AMARL with differing system sizes in both homogeneous and heterogeneous multi-agent systems. This experimentation shows that AMARL leads to faster and more efficient learning than standard reinforcement learning while consistently meeting safety requirements.