Publications
All topic tags:
survey, deep-rl, multi-agent-rl, agent-modelling, ad-hoc-teamwork, autonomous-driving, goal-recognition, explainable-ai, causal, generalisation, security, emergent-communication, iterated-learning, intrinsic-reward, simulator, state-estimation, deep-learning, transfer-learning
Selected tag: intrinsic-reward
2022
Lukas Schäfer, Filippos Christianos, Josiah P. Hanna, Stefano V. Albrecht
Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration
International Conference on Autonomous Agents and Multi-Agent Systems, 2022
Abstract | BibTeX | arXiv | Code
AAMAS, deep-rl, intrinsic-reward
Abstract:
Intrinsic rewards can improve exploration in reinforcement learning, but the exploration process may suffer from instability caused by non-stationary reward shaping and strong dependency on hyperparameters. In this work, we introduce Decoupled RL (DeRL) as a general framework which trains separate policies for intrinsically-motivated exploration and exploitation. Such decoupling allows DeRL to leverage the benefits of intrinsic rewards for exploration while demonstrating improved robustness and sample efficiency. We evaluate DeRL algorithms in two sparse-reward environments with multiple types of intrinsic rewards. Our results show that DeRL is more robust to varying scale and rate of decay of intrinsic rewards and converges to the same evaluation returns as intrinsically-motivated baselines in fewer interactions. Lastly, we discuss the challenge of distribution shift and show that divergence constraint regularisers can successfully minimise instability caused by divergence of exploration and exploitation policies.
@inproceedings{schaefer2022derl,
title={Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration},
author={Lukas Schäfer and Filippos Christianos and Josiah P. Hanna and Stefano V. Albrecht},
booktitle={International Conference on Autonomous Agents and Multiagent Systems (AAMAS)},
year={2022}
}
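The abstract describes DeRL as training two separate policies: an exploration policy optimised for extrinsic plus intrinsic reward, which collects experience, and an exploitation policy trained on that same experience using extrinsic reward only. Below is a minimal, self-contained sketch of such a decoupled loop; it is not the authors' implementation, and the toy sparse-reward chain environment, count-based novelty bonus, and tabular Q-learning updates are illustrative assumptions.

# Minimal sketch of a decoupled exploration/exploitation loop in the spirit of
# DeRL (not the authors' implementation): a toy sparse-reward chain environment,
# a count-based intrinsic reward, and two independent tabular Q-learning tables.
# The exploration policy acts in the environment and is trained on extrinsic +
# intrinsic reward; the exploitation policy is trained off-policy on the same
# transitions using extrinsic reward only.
import random
from collections import defaultdict

N_STATES, ACTIONS = 10, (0, 1)          # chain of 10 states, actions: left/right
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1

def step(s, a):
    """Sparse-reward chain: +1 only for reaching the right end."""
    s_next = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    r_ext = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, r_ext, s_next == N_STATES - 1

def act(q, s):
    """Epsilon-greedy action selection with respect to a Q-table."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(s, a)])

def td_update(q, s, a, r, s_next, done):
    """One-step Q-learning update."""
    target = r + (0.0 if done else GAMMA * max(q[(s_next, b)] for b in ACTIONS))
    q[(s, a)] += ALPHA * (target - q[(s, a)])

q_explore = defaultdict(float)   # trained on extrinsic + intrinsic reward
q_exploit = defaultdict(float)   # trained on extrinsic reward only
counts = defaultdict(int)        # state visitation counts for the novelty bonus

for episode in range(500):
    s, done, t = 0, False, 0
    while not done and t < 50:
        a = act(q_explore, s)                 # exploration policy collects data
        s_next, r_ext, done = step(s, a)
        counts[s_next] += 1
        r_int = 1.0 / counts[s_next] ** 0.5   # count-based novelty bonus
        td_update(q_explore, s, a, r_ext + r_int, s_next, done)
        td_update(q_exploit, s, a, r_ext, s_next, done)   # decoupled update
        s, t = s_next, t + 1

greedy = [max(ACTIONS, key=lambda a: q_exploit[(s, a)]) for s in range(N_STATES)]
print("Greedy exploitation actions per state:", greedy)

Because the exploitation table is updated with extrinsic reward only, its greedy behaviour is unaffected by the scale or decay rate of the intrinsic bonus, which reflects the robustness property highlighted in the abstract.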
2021
Lukas Schäfer, Filippos Christianos, Josiah Hanna, Stefano V. Albrecht
Decoupling Exploration and Exploitation in Reinforcement Learning
ICML Workshop on Unsupervised Reinforcement Learning, 2021
Abstract | BibTeX | arXiv | Code
ICML, deep-rl, intrinsic-reward
Abstract:
Intrinsic rewards are commonly applied to improve exploration in reinforcement learning. However, these approaches suffer from instability caused by non-stationary reward shaping and strong dependency on hyperparameters. In this work, we propose Decoupled RL (DeRL) which trains separate policies for exploration and exploitation. DeRL can be applied with on-policy and off-policy RL algorithms. We evaluate DeRL algorithms in two sparse-reward environments with multiple types of intrinsic rewards. We show that DeRL is more robust to scaling and speed of decay of intrinsic rewards and converges to the same evaluation returns as intrinsically motivated baselines in fewer interactions.
@inproceedings{schaefer2021decoupling,
title={Decoupling Exploration and Exploitation in Reinforcement Learning},
author={Lukas Schäfer and Filippos Christianos and Josiah Hanna and Stefano V. Albrecht},
booktitle={ICML Workshop on Unsupervised Reinforcement Learning (URL)},
year={2021}
}