G. Sartoretti, Y. Wu, W. Paivine, S. Kumar, S. Koenig and H. Choset. Distributed Reinforcement Learning for Multi-Robot Decentralized Collective Construction. In International Symposium on Distributed Autonomous Robotic Systems (DARS), pages 35-49, 2018.

Abstract: Inspired by recent advances in single-agent reinforcement learning, this paper extends the single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy and critic learning, but decentralized policy execution, in a fully observable system. We show that the sum of experience of all agents can be leveraged to quickly train a collaborative policy that naturally scales to smaller and larger swarms. We demonstrate the applicability of our method on a multi-robot construction problem, where agents need to arrange simple block elements to build a user-specified structure. We present simulation results where swarms of various sizes successfully construct different test structures without the need for additional training.
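
The idea of centralized learning with decentralized execution can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a two-action bandit instead of the construction environment, a table of logits instead of a neural network, and synchronous updates instead of A3C's asynchronous workers. The names SharedActorCritic, toy_reward and train are hypothetical. Every agent samples actions from the same policy, and every agent's experience updates the same actor and critic parameters.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class SharedActorCritic:
    """One parameter set used by every agent (a homogeneous policy).

    Hypothetical toy model: the actor is a list of logits and the
    critic is a single scalar baseline.
    """

    def __init__(self, n_actions, lr_actor=0.1, lr_critic=0.1):
        self.logits = [0.0] * n_actions  # actor parameters
        self.value = 0.0                 # critic estimate of expected reward
        self.lr_actor = lr_actor
        self.lr_critic = lr_critic

    def act(self, rng):
        """Decentralized execution: an agent samples from the shared policy."""
        probs = softmax(self.logits)
        return rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, action, reward):
        """Centralized learning: one agent's experience updates shared parameters."""
        probs = softmax(self.logits)
        advantage = reward - self.value
        for a in range(len(self.logits)):
            # Gradient of log softmax probability of `action` w.r.t. logit a.
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            self.logits[a] += self.lr_actor * advantage * grad_log_pi
        self.value += self.lr_critic * advantage


def toy_reward(action):
    """Assumed reward for illustration only: action 1 is the useful one."""
    return 1.0 if action == 1 else 0.0


def train(n_agents=4, steps=500, seed=0):
    """Let several agents act with, and learn into, a single shared model."""
    rng = random.Random(seed)
    model = SharedActorCritic(n_actions=2)
    for _ in range(steps):
        for _agent in range(n_agents):
            action = model.act(rng)
            model.update(action, toy_reward(action))
    return model


if __name__ == "__main__":
    trained = train()
    print(softmax(trained.logits))
```

Because the parameters are shared, changing n_agents only changes how much experience is gathered per step. The learned policy itself does not depend on the number of agents, which is the property the abstract refers to when it mentions scaling to smaller and larger swarms.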

