Publications
2023
Filippos Christianos, Georgios Papoudakis, Stefano V. Albrecht
Pareto Actor-Critic for Equilibrium Selection in Multi-Agent Reinforcement Learning
Transactions on Machine Learning Research, 2023
Tags: TMLR, deep-rl, multi-agent-rl
Abstract:
This work focuses on equilibrium selection in no-conflict multi-agent games, where we specifically study the problem of selecting a Pareto-optimal Nash equilibrium among several existing equilibria. It has been shown that many state-of-the-art multi-agent reinforcement learning (MARL) algorithms are prone to converging to Pareto-dominated equilibria due to the uncertainty each agent has about the policy of the other agents during training. To address sub-optimal equilibrium selection, we propose Pareto Actor-Critic (Pareto-AC), which is an actor-critic algorithm that utilises a simple property of no-conflict games (a superset of cooperative games): the Pareto-optimal equilibrium in a no-conflict game maximises the returns of all agents and, therefore, is the preferred outcome for all agents. We evaluate Pareto-AC in a diverse set of multi-agent games and show that it converges to higher episodic returns compared to seven state-of-the-art MARL algorithms and that it successfully converges to a Pareto-optimal equilibrium in a range of matrix games. Finally, we propose PACDCG, a graph neural network extension of Pareto-AC, which is shown to efficiently scale in games with a large number of agents.
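To make the equilibrium-selection problem in the abstract concrete, the following is a minimal sketch that enumerates the pure-strategy Nash equilibria of a two-player matrix game and keeps those that are not Pareto-dominated by another equilibrium. It is not Pareto-AC and is not taken from the paper; the function names and the stag-hunt payoff values are illustrative assumptions chosen so that the game has one Pareto-optimal and one Pareto-dominated equilibrium.

```python
from itertools import product

def pure_nash_equilibria(payoffs_a, payoffs_b):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game."""
    n_rows, n_cols = len(payoffs_a), len(payoffs_a[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        # Row player cannot gain by switching rows while the column stays fixed.
        row_best = all(payoffs_a[i][j] >= payoffs_a[k][j] for k in range(n_rows))
        # Column player cannot gain by switching columns while the row stays fixed.
        col_best = all(payoffs_b[i][j] >= payoffs_b[i][l] for l in range(n_cols))
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria

def pareto_optimal(equilibria, payoffs_a, payoffs_b):
    """Keep the equilibria that no other equilibrium Pareto-dominates."""
    def dominates(x, y):
        ax, bx = payoffs_a[x[0]][x[1]], payoffs_b[x[0]][x[1]]
        ay, by = payoffs_a[y[0]][y[1]], payoffs_b[y[0]][y[1]]
        # x dominates y if no player is worse off and at least one is better off.
        return ax >= ay and bx >= by and (ax > ay or bx > by)
    return [e for e in equilibria if not any(dominates(o, e) for o in equilibria)]

# Illustrative stag-hunt payoffs (assumed values); action 0 = stag, 1 = hare.
A = [[4, 0],
     [3, 2]]  # row player's payoffs
B = [[4, 3],
     [0, 2]]  # column player's payoffs

nash = pure_nash_equilibria(A, B)
best = pareto_optimal(nash, A, B)
print(nash)  # [(0, 0), (1, 1)]
print(best)  # [(0, 0)]
```

In this example both (stag, stag) and (hare, hare) are equilibria, but only (stag, stag) maximises the returns of both players. Learners that are uncertain about the other agent's policy can settle on the safer, Pareto-dominated (hare, hare) outcome, which is the failure mode the abstract describes.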
BibTeX:
@article{christianos2023pareto,
title={Pareto Actor-Critic for Equilibrium Selection in Multi-Agent Reinforcement Learning},
author={Filippos Christianos and Georgios Papoudakis and Stefano V. Albrecht},
journal={Transactions on Machine Learning Research (TMLR)},
year={2023}
}