Faculty
Dr. Stefano V. Albrecht
Associate Professor in Artificial Intelligence
Head of Research Group
Dr. David Abel
Honorary Fellow at University of Edinburgh
Senior Research Scientist at DeepMind
Postdoctoral Researchers
Dr. Atish Dixit
PhD in Engineering, Heriot-Watt University, 2023; MTech in Mathematical Modelling, 2017; BEng in Mechanical Engineering, 2012, University of Pune
Project: Multi-Agent Reinforcement Learning and Ad Hoc Multi-Agent Collaboration
Research interests: reinforcement learning, multi-agent systems, robotics, intelligent control
Dr. Alper Demir
PhD in Computer Engineering, 2019; MSc in Computer Engineering, 2016; BSc in Computer Engineering, 2014, Middle East Technical University
Project: Solving Asymmetric Social Dilemmas in Multi-Agent Reinforcement Learning
Research interests: reinforcement learning, multi-agent systems, data science
Dr. Josiah P. Hanna (Jan 2019 – July 2020)
PhD in Computer Science, University of Texas at Austin, 2019; BS in Computer Science and Mathematics, University of Kentucky, 2014
Project: Towards Model Criticism in Multi-Agent Systems
Research interests: reinforcement learning, policy evaluation, robotics, autonomous driving
Dr. Ignacio Carlucho (Nov 2021 – April 2023)
PhD in Engineering, National University of Central Buenos Aires, 2019; BS in Electromechanical Engineering, National University of Central Buenos Aires, 2015
Project: Explainable Reasoning, Learning and Ad hoc Multi-agent Collaboration
Research interests: reinforcement learning, multi-agent systems, robotics, intelligent control
Dr. Cheng Wang (Oct 2022 – April 2024)
PhD in Engineering, Technical University of Darmstadt, 2021; MS in Automotive Engineering, Tongji University, 2017; BS in Automotive Engineering, Wuhan University of Technology, 2014
Project: Asking Your Autonomous Car to Explain its Decisions: Towards AI That Explains Itself
Research interests: prediction and planning, simulation and testing, safety verification and validation, autonomous vehicles
Dr. Dongge Han (Aug 2023 – May 2024)
PhD in Computer Science, University of Oxford, 2022; MSc in Computer Science, University of Oxford, 2016; BSc in Physics, The Hong Kong University of Science and Technology, 2015
Project: Multimodal Integration for Sample-Efficient Deep Reinforcement Learning
Research interests: reinforcement learning, recommender systems, multi-agent systems
PhD Research Students
Elliot Fosong
BA & MEng in Engineering, University of Cambridge, 2019
Project: Coordination of Pre-skilled Agents to Complete Unseen Tasks
Teams of autonomous agents can be trained to complete specific desirable tasks. When a new task arises, we might wish to form a new team to complete it by selecting existing agents whose skills could be useful for the new task. Despite the agents' individual skills in their respective roles, the newly formed team needs to learn to coordinate to solve the new task. This project aims to develop methods by which already-skilled agents can learn to cooperate on a new task, given a limited number of 'trial runs' on that task.
Lukas Schäfer
MSc Informatics, University of Edinburgh, 2019; BSc Computer Science, Saarland University, 2018
Project: Sample Efficiency and Generalisation in Multi-Agent Reinforcement Learning
Multi-agent reinforcement learning systems often require many millions of interactions to learn complex behaviour. Furthermore, the learned behaviour usually lacks the ability to generalise. Together, these challenges of sample efficiency and generalisation severely limit the possible applications of multi-agent reinforcement learning. This project will leverage distributed information and the multi-agent nature of such systems to enable agents to learn effective behaviours with less data and to learn robust, reusable skills that transfer to new environments.
Mhairi Dunion
BSc (Hons) Mathematics, University of Edinburgh, 2013
Project: Causality in Deep Reinforcement Learning
A challenge of deep reinforcement learning is that it overfits to the training task and therefore does not generalise to unseen tasks with the same underlying dynamics. In practice, it is common to train algorithms with random initialisations of all environment variables to maximise the variety of tasks seen during training, which is neither pragmatic nor sample efficient. This project will investigate novel methods that combine causal inference techniques with deep reinforcement learning to improve generalisation to unseen tasks, since causal relationships remain invariant to changes in task.
Shangmin Guo
MSc Data Science, University of Edinburgh, 2019; BE Computer Science, China National University of Defense Technology, 2014
Project: Deep Iterated Learning
Over the course of human history, natural languages have evolved such that people can easily interpret the meaning of novel phrases with the help of compositionality. During this evolution, languages have been shaped by repeated learning, interaction, and transmission, a process referred to as iterated learning. Treating language as a special kind of representation, this project aims to explore whether and how the same learning mechanism could be applied to deep learning (hence deep iterated learning), and whether it can help to improve the generalisation of multi-agent systems and machine learning systems more broadly.
Samuel Garcin
MEng in Aeronautical Engineering, Imperial College London, 2018
Project: Adaptive Curriculum Design for Generalisation in Deep Reinforcement Learning
A key limitation preventing the wider adoption of Deep Reinforcement Learning (DRL) today is its difficulty generalising to environments or tasks that were not encountered during training. The frameworks put forward to tackle this issue, such as Meta Reinforcement Learning or Representation Learning, primarily focus on the DRL agent and do not act on the training task generation process. This project will investigate how Representation Learning methods can be employed to learn a structured latent representation of the problem class to be solved, enabling the generation of an adaptive task distribution that captures the problem class and is matched to the agent's current level of ability.
Balint Gyevnar
MInf Informatics, University of Edinburgh, 2021
Project: Natural Language Explanations for Autonomous Vehicle Motion Planning and Prediction
Achieving trust and safety for autonomous vehicles is critical to their public success. However, most current methods rely on opaque and unaccountable black-box algorithms, making their legal and social adoption difficult. Instead, grounded in interpretable and explainable methods such as IGP2 and integrated with natural language processing and cognitive modelling, my project will investigate how to generate and deliver the most relevant and intelligible explanations for users, with the end goal of building trust and transparency in autonomous vehicles.
Trevor McInroe
MS Artificial Intelligence, Northwestern University, 2022; BBA Economics, University of North Texas, 2017
Project: Enabling Real-World Offline Reinforcement Learning with Representation Learning
The sample complexity of reinforcement learning (RL) algorithms hinders their application to real-world systems. This inefficiency is exacerbated by high-dimensional state spaces, such as those composed of pixels. My research will investigate how representation learning routines can disentangle useful information from small offline datasets of robotic tasks. My ultimate goal is to make RL feasible for the average industry group in the same way that computer vision has become accessible over the past decade.
Sabrina McCallum-Exner
MSc Artificial Intelligence, University of Strathclyde, 2022; BA Business Administration, Berlin School of Economics and Law, 2015
Project: Learning Grounded Representations from Multi-Modal Feedback and Interactions with Embodied Environments
Learning complex, hierarchical tasks or diverse, open-ended tasks when rewards are sparse or there is no clear success criterion remains a challenge for RL agents. Manually crafting dense shaping rewards is non-trivial and even potentially infeasible for some environments, and choosing a good heuristic requires domain knowledge, typically resulting in task- and environment-specific solutions. This project explores alternative approaches which instead leverage information-rich language feedback and other multi-modal signals resulting directly from interactions of embodied agents with their environment.
Raul Steleac
MSc in Computing (Artificial Intelligence and Machine Learning), Imperial College London, 2021
Project: Continual Multi-Agent Reinforcement Learning
Addressing the challenge of adaptability in multi-agent reinforcement learning, this project aims to improve agents' abilities to cooperate with diverse and changing teammates after deployment. Current approaches expose the learner to various team configurations during training, assuming this instils enough adaptability to develop a sufficiently general cooperation capacity; however, these methods often fail to cover the entire teammate policy space in complex scenarios. This project introduces Continual Multi-Agent Reinforcement Learning to extend agents' learning capabilities beyond the initial training period, allowing them to adjust effectively to previously unseen team configurations and strategies.
Kale-ab Tessera
MSc Computer Science, University of the Witwatersrand, 2021; BSc Hons in Computer Science, University of Pretoria, 2016
Project: Scalable Coordination in Multi-Agent Reinforcement Learning
In Multi-Agent Reinforcement Learning (MARL), scalable coordination has historically been a significant challenge, limiting the real-world applicability of these algorithms. With the increasing deployment of machine learning systems, the need for scalable MARL methods that can interact with diverse agents becomes even more crucial. Our research focuses on improving the scalability of MARL algorithms, while also ensuring these algorithms can adapt to seen and unseen agents.
Elle Miller
Bachelor of Mechatronic (Space) Engineering & Bachelor of Advanced Science (Physics), University of Sydney, 2023
Project: Deep Reinforcement Learning for Safe and Compliant Human-Robot Interaction
Robots possess significant potential to enhance the quality of life for individuals with disabilities, support healthcare professionals, and provide care to an ageing population. To help with tasks such as drinking, dressing, and personal hygiene, robots will need to perform very intricate behaviours while in direct physical contact with humans. Deep Reinforcement Learning (DRL) has emerged as a promising avenue to acquire diverse complex behaviours safely through simulation. However, an open challenge lies in transferring these learned policies to real-world scenarios while ensuring safety. In this project, we investigate the potential of DRL to learn safe and compliant assistive behaviours for physical human-robot interaction.
Leonard Hinckeldey
MSc Applied Social Data Science, London School of Economics, 2023; BSc Economics, SOAS University of London, 2022
Project: Collaborative Multi-Agent Reinforcement Learning for Ad-Hoc Human-AI Teams
Multi-Agent Reinforcement Learning (MARL) holds great potential for coordinating the behaviour of artificial agents in complex real-world environments. However, MARL agents often perform poorly when partnered with agents previously unseen during training, which hampers their deployment in real-world environments that involve human agents. This project explores how MARL algorithms can learn generalisable policies, enabling effective and spontaneous collaboration between RL agents and humans.
Zhu Zheng
MSc Robotics, University of Bristol, 2021
Project: Hierarchical Structure in Multi-Agent Reinforcement Learning
A common approach for MARL is to conduct centralised training at the level of primitive actions. However, complex or sparse-reward problems often involve reasoning on multiple time scales. Enabling agents to make decisions at higher levels of abstraction remains a significant challenge. This project aims to leverage hierarchical structures, such as various levels of temporal abstraction, to enhance learning performance in multi-agent systems without relying on prior knowledge.
Ibrahim H. Ahmed (Sep 2018 – May 2022)
MS in Computer Science, UC Davis, 2018; BS in Computer Science, UC Davis, 2016
Project: Quantum-Secure Authentication and Key Agreement via Abstract Multi-Agent Interaction
Authentication and key establishment are the foundation for secure communication over computer networks. However, modern protocols which rely on public key cryptography for secure communication are vulnerable to quantum technology–based attacks. My project studies a novel quantum-safe method for authentication and key establishment based on abstract multi-agent interaction. It introduces these fields to multi-agent techniques for optimisation and rational decision-making.
Arrasy Rahman (Sep 2018 – Apr 2023)
MSc Data Science, University of Edinburgh, 2017; BSc Computer Science, Universitas Indonesia, 2015
Project: Ad Hoc Teamwork in Open Multi-Agent Systems using Graph Neural Networks
Many real-world problems require an agent to achieve specific goals by interacting with other agents without predefined coordination protocols. Prior work on ad hoc teamwork focused on multi-agent systems in which the number of agents is assumed to be fixed. My project focuses on using Graph Neural Networks (GNNs) to handle interaction data between varying numbers of agents. We explore the possibility of combining GNNs with reinforcement learning techniques to implement agents that can perform well in teams with dynamic composition.
Filippos Christianos (Sep 2018 – Jun 2023)
Diploma in Electronic and Computer Engineering, Technical University of Crete, 2017
Project: Coordinated Exploration in Multi-Agent Deep Reinforcement Learning
In the increasingly large state space encountered in deep reinforcement learning, exploration plays a critical role by narrowing down the search for an optimal policy. In multi-agent settings, the joint action space also grows exponentially, further complicating the search. The use of a partially centralized policy while exploring can coordinate the exploration and more easily locate promising, even decentralized, policies. In this project, we investigate how the coordination of agents in the exploration phase can improve the performance of deep reinforcement learning algorithms.
Georgios Papoudakis (Sep 2018 – Mar 2024)
Diploma in Electrical and Computer Engineering, Aristotle University of Thessaloniki, 2017
Project: Modelling in Multi-Agent Systems Using Representation Learning
Multi-agent systems in partially observable environments face many challenging problems which traditional reinforcement learning algorithms fail to address. Agents have to deal with the lack of information about the environment's state and the opponents' beliefs and goals. A promising research direction is to learn models of the other agents to better understand their interactions. This project will investigate representation learning for opponent modelling in order to improve learning in multi-agent systems.
Cillian Brewitt (Jan 2019 – May 2023)
MSc Artificial Intelligence, University of Edinburgh, 2017; BE Electrical and Electronic Engineering, University College Cork, 2016
Project: Interpretable Planning and Prediction for Autonomous Vehicles
Accurately predicting the intentions and actions of other road users, and then using this information during motion planning, is an important task in the field of autonomous driving. It is desirable for planning and prediction methods to be fast, accurate, interpretable, and verifiable; however, current methods fail to achieve all of these objectives. This project will investigate novel prediction and planning methods that satisfy these objectives. My current focus is investigating how decision trees can be used for vehicle goal recognition.
Visiting Researchers
Xuehui Yu
BE Computer Science, Harbin Engineering University, 2019
Project: Generalisation in Reinforcement Learning via Causal Inference
Reinforcement Learning (RL) has proven to be an effective tool for training agents on difficult sequential decision-making problems. Most of the early successes focus on a fixed task in a fixed environment. However, real applications often involve changing environments, and RL agents may struggle to generalise well due to overfitting to their training environments. Such diverse, dynamic, and unpredictable environments place great demands on RL agents to reuse experience and adapt quickly. To improve generalisation to unseen tasks, this project combines causal inference techniques with RL, enabling RL agents to understand the world from a causal perspective and quickly adapt to out-of-distribution domains.
Riccardo Zamboni
MSc Automation and Control Engineering, Polytechnic University of Milan, 2019; BSc Industrial Engineering, University of Trento, 2017
Project: Offline Multi-Agent Reinforcement Learning
The sample inefficiency of many multi-agent reinforcement learning (MARL) algorithms presents considerable challenges for real-world applications. Unlocking the potential of offline methods in multi-agent settings is particularly compelling. Recent findings have also highlighted inconsistencies in baselines and evaluation protocols in offline MARL, revealing that simple independent learners can often compete with state-of-the-art offline algorithms. This research will focus on how to leverage information from other agents to surpass independent learning and foster effective coordination, even when dealing with small offline datasets.
Maciej Wiatrak (Sep 2019 – Nov 2019)
BASc Mathematics & Computer Science, University College London, 2019
Project: Stabilising Generative Adversarial Networks with Multi-Agent Reinforcement Learning
Generative Adversarial Networks (GANs) are a state-of-the-art machine learning method; however, their training suffers from instability problems such as oscillatory behaviour and vanishing gradients. In this project, we outline the connections between GANs and Multi-Agent Reinforcement Learning (MARL), which is concerned with the stable concurrent learning of multiple actors towards an equilibrium solution. Building on this connection, we propose a GAN training method that utilises an established MARL technique based on variable learning rates.
Giuseppe Vecchio (Feb 2022 – April 2022)
MSc in Computer Engineering, UNICT, 2020; BSc in Computer Engineering, UNICT, 2018
Project: Imitation Learning for Autonomous Robot Navigation in Unstructured Environments
Mobile robots have become part of everyday life in many forms, such as service robots, planetary rovers, and autonomous cars. For ground robots, the ability to navigate the surrounding environment is essential for long-term operation in real-world scenarios and depends heavily on the ability to adapt quickly to new, unseen settings. This project will explore the use of Imitation Learning-based techniques in simulation environments for autonomous navigation in unstructured settings, and the use of domain adaptation approaches for real-world applications.
Alain Andres Fernandez (June 2022 – Aug 2022)
MSc Telecommunication Engineering, University of the Basque Country, 2019; BSc Telecommunication Engineering, University of the Basque Country, 2017
Project: Exploration under Sparse Rewards with Deep Reinforcement Learning
Reinforcement learning algorithms are highly dependent on the feedback signals (rewards) provided by the environment. Unfortunately, designing a suitable reward function is not always trivial, and adopting sparse signals that merely indicate whether the task has been accomplished is a common choice. However, such sparse reward signals pose difficult exploration challenges. This project will investigate the adoption of Imitation Learning and Intrinsic Motivation solutions to enhance exploration and improve sample efficiency in challenging sparse-reward environments.
Mahdi Kazemi Moghaddam (Dec 2022 – Feb 2023)
Hons. Degree of Bachelor of Computer Science, University of Adelaide, 2019; BSc Electrical Engineering, Electronics, Amirkabir University of Technology, 2015
Project: Fairness and Social Welfare in Autonomous Driving
Most of the current autonomous driving approaches based on reinforcement learning focus on maximising some notion of performance (e.g. travel time) or incorporating driving safety. As such, less attention has been paid to the social welfare and fairness aspects of the interaction of such trained policies with other road users. In this project, we aim to train a population of autonomous vehicles which can achieve high individual performance while cooperating with other agents to maximise the overall social welfare and fairness.
Bram Renting (April 2024 – June 2024)
MSc Embedded Systems, Delft University of Technology, 2019; BSc Marine Technology, Delft University of Technology, 2016
Project: Multi-Agent Negotiation for Scheduling Human-Robot Cooperation in Warehouses
In warehouses where humans and robots are interdependent, precise coordination is required to reduce idle time and improve output. Traditional centralised coordination algorithms are complex and costly to design and do not transfer to other warehouses. Decentralised learning methods could alleviate some of the complexity and improve transferability. This project takes a multi-agent approach, where agents represent humans and robots. The agents plan routes to task locations and negotiate with other agents to schedule meetings to perform the task.