Error Cascades in Collective Behavior: A Case Study of the Gradient Algorithm on 1000 Physical Agents
Type
The gradient, or hop count, algorithm is inspired by nat- ural phenomena such as the morphogen gradients present in multi-cellular development. It has several applications in multi-agent systems and sensor networks, serving as a basis for self-organized coordinate system formation, and finding shortest paths for message passing. It is a simple and well- understood algorithm in theory. However, we show here that in practice, it is highly sensitive to specific rare errors that emerge at larger scales. We implement it on a system of 1000 physical agents (Kilobot robots) that communicate asynchronously via a noisy wireless channel. We observe that spontaneous, short-lasting rare errors made by a sin- gle agent (e.g. due to message corruption) propagate spa- tially and temporally, causing cascades that severely hinder the algorithm’s functionality. We develop a mathematical model for temporal error propagation and validate it with experiments on 100 agents. This work shows how multi- agent algorithms that are believed to be simple and robust from theoretical insight may be highly challenging to im- plement on physical systems. Systematically understanding and quantifying their current limitations is a first step in the direction of improving their robustness for implementation.