Publications
Selected tags: Subramanian-Ramamoorthy
2023
Anthony Knittel, Majd Hawasly, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy
DiPA: Probabilistic Multi-Modal Interactive Prediction for Autonomous Driving
IEEE Robotics and Automation Letters, 2023
Abstract | BibTex | arXiv | Publisher
RA-L, autonomous-driving, state-estimation
Abstract:
Accurate prediction is important for operating an autonomous vehicle in
interactive scenarios. Prediction must be fast, to support multiple
requests from a planner exploring a range of possible futures. The
generated predictions must accurately represent the probabilities of
predicted trajectories, while also capturing different modes of
behaviour (such as turning left vs continuing straight at a junction).
To this end, we present DiPA, an interactive predictor that addresses
these challenging requirements. Previous interactive prediction methods
use an encoding of k-mode-samples, which under-represents the full
distribution. Other methods optimise closest-mode evaluations, which
test whether one of the predictions is similar to the ground-truth, but
allow additional unlikely predictions to occur, over-representing
unlikely predictions. DiPA addresses these limitations by using a
Gaussian-Mixture-Model to encode the full distribution, and optimising
predictions using both probabilistic and closest-mode measures. These
objectives respectively optimise probabilistic accuracy and the ability
to capture distinct behaviours, and there is a challenging trade-off
between them. We are able to solve both together using a novel training
regime. DiPA achieves new state-of-the-art performance on the
INTERACTION and NGSIM datasets, and improves over the baseline (MFP)
when both closest-mode and probabilistic evaluations are used. This
demonstrates effective prediction for supporting a planner on
interactive scenarios.
@article{Knittel2023dipa,
title={{DiPA:} Probabilistic Multi-Modal Interactive Prediction for Autonomous Driving},
author={Anthony Knittel and Majd Hawasly and Stefano V. Albrecht and John Redford and Subramanian Ramamoorthy},
journal={IEEE Robotics and Automation Letters},
volume={8},
number={8},
pages={4887--4894},
year={2023}
}
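The trade-off described in the abstract, optimising a probabilistic measure and a closest-mode measure over the same Gaussian mixture, can be illustrated with a toy sketch. This is not the paper's implementation: the mixture here is a diagonal-covariance 2-D GMM over a single predicted position, and `combined_loss` and `alpha` are illustrative names.

```python
import numpy as np

def gmm_nll(y, means, sigmas, weights):
    """Probabilistic objective: negative log-likelihood of the ground
    truth y under a 2-D diagonal-covariance Gaussian mixture."""
    diff = (y - means) / sigmas                         # (K, 2)
    log_comp = (np.log(weights)
                - np.sum(np.log(sigmas), axis=1)
                - np.log(2.0 * np.pi)
                - 0.5 * np.sum(diff ** 2, axis=1))      # (K,)
    m = log_comp.max()                                  # stable log-sum-exp
    return -(m + np.log(np.sum(np.exp(log_comp - m))))

def closest_mode_error(y, means):
    """Closest-mode objective: distance from the ground truth to the
    nearest predicted mode (minADE/FDE-style)."""
    return float(np.min(np.linalg.norm(means - y, axis=1)))

def combined_loss(y, means, sigmas, weights, alpha=0.5):
    """Weighted mix of the two objectives; training must trade them off."""
    return (alpha * gmm_nll(y, means, sigmas, weights)
            + (1.0 - alpha) * closest_mode_error(y, means))
```

A mixture with one mode on the ground truth scores zero closest-mode error, yet its NLL still depends on how much probability mass the other modes draw away, which is exactly the tension between the two evaluations.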
2022
Majd Hawasly, Jonathan Sadeghi, Morris Antonello, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy
Perspectives on the System-level Design of a Safe Autonomous Driving Stack
AI Communications, 2022
Abstract | BibTex | arXiv | Publisher
AIC, survey, autonomous-driving, goal-recognition, explainable-ai
Abstract:
Achieving safe and robust autonomy is the key bottleneck on the path towards broader adoption of autonomous vehicle technology. This motivates going beyond extrinsic metrics such as miles between disengagements, and calls for approaches that embody safety by design. In this paper, we address some aspects of this challenge, with emphasis on issues of motion planning and prediction. We do this through description of novel approaches taken to solving selected sub-problems within an autonomous driving stack, in the process introducing the design philosophy being adopted within Five. This includes safe-by-design planning, interpretable as well as verifiable prediction, and modelling of perception errors to enable effective sim-to-real and real-to-sim transfer within the testing pipeline of a realistic autonomous system.
@article{albrecht2022aic,
author = {Majd Hawasly and Jonathan Sadeghi and Morris Antonello and Stefano V. Albrecht and John Redford and Subramanian Ramamoorthy},
title = {Perspectives on the System-level Design of a Safe Autonomous Driving Stack},
journal = {AI Communications, Special Issue on Multi-Agent Systems Research in the UK},
year = {2022}
}
Francisco Eiras, Majd Hawasly, Stefano V. Albrecht, Subramanian Ramamoorthy
A Two-Stage Optimization-based Motion Planner for Safe Urban Driving
IEEE Transactions on Robotics, 2022
Abstract | BibTex | arXiv | Publisher | Video
T-RO, autonomous-driving
Abstract:
Recent road trials have shown that guaranteeing the safety of driving decisions is essential for the wider adoption of autonomous vehicle technology. One promising direction is to pose safety requirements as planning constraints in nonlinear, non-convex optimization problems of motion synthesis. However, many implementations of this approach are limited by uncertain convergence and local optimality of the solutions achieved, affecting overall robustness. To improve upon these issues, we propose a novel two-stage optimization framework: in the first stage, we find a solution to a Mixed-Integer Linear Programming (MILP) formulation of the motion synthesis problem, the output of which initializes a second Nonlinear Programming (NLP) stage. The MILP stage enforces hard constraints of safety and road rule compliance generating a solution in the right subspace, while the NLP stage refines the solution within the safety bounds for feasibility and smoothness. We demonstrate the effectiveness of our framework via simulated experiments of complex urban driving scenarios, outperforming a state-of-the-art baseline in metrics of convergence, comfort and progress.
@article{eiras2021twostage,
title = {A Two-Stage Optimization-based Motion Planner for Safe Urban Driving},
author = {Francisco Eiras and Majd Hawasly and Stefano V. Albrecht and Subramanian Ramamoorthy},
journal = {IEEE Transactions on Robotics},
volume = {38},
number = {2},
pages = {822--834},
year = {2022},
doi = {10.1109/TRO.2021.3088009}
}
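A minimal sketch of the two-stage idea: a linear first stage produces a feasible plan that warm-starts a nonlinear refinement. This is not the paper's formulation; the MILP stage is replaced here by a plain LP (no integer variables), and the driving problem is reduced to a hypothetical 1-D corridor toy.

```python
import numpy as np
from scipy.optimize import linprog, minimize

N, x_goal, x_max, v_max = 12, 5.0, 6.0, 1.0   # steps, goal, corridor, speed cap

# Stage 1 -- linear program (stand-in for the MILP stage): hard
# constraints only.  Variables x_1..x_N with x_0 = 0; enforce
# 0 <= x_i <= x_max (corridor) and |x_i - x_{i-1}| <= v_max (speed),
# and push the final point toward the goal with a linear objective.
A_ub, b_ub = [], []
for i in range(N):
    e = np.zeros(N)
    e[i] = 1.0
    if i > 0:
        e[i - 1] = -1.0
    A_ub += [e, -e]
    b_ub += [v_max, v_max]
c = np.zeros(N)
c[-1] = -1.0                                   # maximise progress x_N
lp = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
             bounds=[(0.0, min(x_goal, x_max))] * N)

# Stage 2 -- nonlinear refinement warm-started at the LP solution:
# soft goal tracking plus an acceleration (comfort) penalty, with the
# corridor kept as a hard bound.
def nlp_cost(x):
    acc = np.diff(np.concatenate(([0.0], x)), n=2)   # discrete acceleration
    return 0.01 * np.sum((x - x_goal) ** 2) + np.sum(acc ** 2)

nlp = minimize(nlp_cost, lp.x, bounds=[(0.0, x_max)] * N)
```

The first stage lands in the right (feasible) subspace; the second stage only smooths within it, which mirrors the convergence benefit the abstract claims for warm-starting the NLP.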
Morris Antonello, Mihai Dobre, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy
Flash: Fast and Light Motion Prediction for Autonomous Driving with Bayesian Inverse Planning and Learned Motion Profiles
IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022
Abstract | BibTex | arXiv
IROS, autonomous-driving, state-estimation
Abstract:
Motion prediction of road users in traffic scenes is critical for autonomous driving systems that must take safe and robust decisions in complex dynamic environments. We present a novel motion prediction system for autonomous driving. Our system is based on the Bayesian inverse planning framework, which efficiently orchestrates map-based goal extraction, a classical control-based trajectory generator and an ensemble of light-weight neural networks specialised in motion profile prediction. In contrast to many alternative methods, this modularity helps isolate performance factors and better interpret results, without compromising performance. This system addresses multiple aspects of interest, namely multi-modality, motion profile uncertainty and trajectory physical feasibility. We report on several experiments with the popular highway dataset NGSIM, demonstrating state-of-the-art performance in terms of trajectory error. We also perform a detailed analysis of our system's components, along with experiments that stratify the data based on behaviours, such as change lane versus follow lane, to provide insights into the challenges in this domain. Finally, we present a qualitative analysis to show other benefits of our approach, such as the ability to interpret the outputs.
@inproceedings{antonello2022flash,
title={Flash: Fast and Light Motion Prediction for Autonomous Driving with {Bayesian} Inverse Planning and Learned Motion Profiles},
author={Morris Antonello and Mihai Dobre and Stefano V. Albrecht and John Redford and Subramanian Ramamoorthy},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2022}
}
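The Bayesian inverse planning step, scoring candidate goals by how rational the observed trajectory looks for each, can be sketched as follows. This is a hypothetical toy, not the paper's system: cost is straight-line distance and `beta` is an assumed rationality coefficient.

```python
import numpy as np

def goal_posterior(traj, goals, beta=2.0, prior=None):
    """Toy rational inverse planning: a goal is likely if the observed
    trajectory is near-optimal for it.  Likelihood(G) is proportional
    to exp(-beta * detour), where detour compares reaching G via the
    observed path against going straight to G from the start."""
    traj, goals = np.asarray(traj, float), np.asarray(goals, float)
    if prior is None:
        prior = np.full(len(goals), 1.0 / len(goals))
    path_len = np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1))
    post = np.empty(len(goals))
    for k, g in enumerate(goals):
        optimal = np.linalg.norm(g - traj[0])
        via_observed = path_len + np.linalg.norm(g - traj[-1])
        post[k] = prior[k] * np.exp(-beta * (via_observed - optimal))
    return post / post.sum()
```

Because each goal's score depends only on simple geometric costs, the posterior stays cheap to evaluate and easy to inspect, which is the modularity and interpretability argument made in the abstract.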
Anthony Knittel, Majd Hawasly, Stefano V. Albrecht, John Redford, Subramanian Ramamoorthy
DiPA: Diverse and Probabilistically Accurate Interactive Prediction
arXiv:2210.06106, 2022
Abstract | BibTex | arXiv
autonomous-driving, state-estimation
Abstract:
Accurate prediction is important for operating an autonomous vehicle in interactive scenarios. Previous interactive predictors have used closest-mode evaluations, which test if one of a set of predictions covers the ground-truth, but not if additional unlikely predictions are made. The presence of unlikely predictions can interfere with planning, by indicating conflict with the ego plan when it is not likely to occur. Closest-mode evaluations are not sufficient for showing a predictor is useful; an effective predictor also needs to accurately estimate mode probabilities, and to be evaluated using probabilistic measures. These two evaluation approaches, e.g. predicted-mode RMS and minADE/FDE, are analogous to precision and recall in binary classification, and there is a challenging trade-off between prediction strategies for each. We present DiPA, a method for producing diverse predictions while also capturing accurate probabilistic estimates. DiPA uses a flexible representation that captures interactions in widely varying road topologies, and uses a novel training regime for a Gaussian Mixture Model that supports diversity of predicted modes, along with accurate spatial distribution and mode probability estimates. DiPA achieves state-of-the-art performance on INTERACTION and NGSIM, and improves over a baseline (MFP) when both closest-mode and probabilistic evaluations are used at the same time.
@misc{knittel2022dipa,
title={{DiPA:} Diverse and Probabilistically Accurate Interactive Prediction},
author={Anthony Knittel and Majd Hawasly and Stefano V. Albrecht and John Redford and Subramanian Ramamoorthy},
year={2022},
eprint={2210.06106},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
2021
Stefano V. Albrecht, Cillian Brewitt, John Wilhelm, Balint Gyevnar, Francisco Eiras, Mihai Dobre, Subramanian Ramamoorthy
Interpretable Goal-based Prediction and Planning for Autonomous Driving
IEEE International Conference on Robotics and Automation, 2021
Abstract | BibTex | arXiv | Video | Code
ICRA, autonomous-driving, goal-recognition, explainable-ai
Abstract:
We propose an integrated prediction and planning system for autonomous driving which uses rational inverse planning to recognise the goals of other vehicles. Goal recognition informs a Monte Carlo Tree Search (MCTS) algorithm to plan optimal maneuvers for the ego vehicle. Inverse planning and MCTS utilise a shared set of defined maneuvers and macro actions to construct plans which are explainable by means of rationality principles. Evaluation in simulations of urban driving scenarios demonstrates the system's ability to robustly recognise the goals of other vehicles, enabling our vehicle to exploit non-trivial opportunities to significantly reduce driving times. In each scenario, we extract intuitive explanations for the predictions which justify the system's decisions.
@inproceedings{albrecht2020igp2,
title={Interpretable Goal-based Prediction and Planning for Autonomous Driving},
author={Stefano V. Albrecht and Cillian Brewitt and John Wilhelm and Balint Gyevnar and Francisco Eiras and Mihai Dobre and Subramanian Ramamoorthy},
booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
year={2021}
}
Josiah P. Hanna, Arrasy Rahman, Elliot Fosong, Francisco Eiras, Mihai Dobre, John Redford, Subramanian Ramamoorthy, Stefano V. Albrecht
Interpretable Goal Recognition in the Presence of Occluded Factors for Autonomous Vehicles
IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021
Abstract | BibTex | arXiv
IROS, autonomous-driving, goal-recognition, explainable-ai
Abstract:
Recognising the goals or intentions of observed vehicles is a key step towards predicting the long-term future behaviour of other agents in an autonomous driving scenario. When there are unseen obstacles or occluded vehicles in a scenario, goal recognition may be confounded by the effects of these unseen entities on the behaviour of observed vehicles. Existing prediction algorithms that assume rational behaviour with respect to inferred goals may fail to make accurate long-horizon predictions because they ignore the possibility that the behaviour is influenced by such unseen entities. We introduce the Goal and Occluded Factor Inference (GOFI) algorithm which bases inference on inverse-planning to jointly infer a probabilistic belief over goals and potential occluded factors. We then show how these beliefs can be integrated into Monte Carlo Tree Search (MCTS). We demonstrate that jointly inferring goals and occluded factors leads to more accurate beliefs with respect to the true world state and allows an agent to safely navigate several scenarios where other baselines take unsafe actions leading to collisions.
@inproceedings{hanna2021interpretable,
title={Interpretable Goal Recognition in the Presence of Occluded Factors for Autonomous Vehicles},
author={Josiah P. Hanna and Arrasy Rahman and Elliot Fosong and Francisco Eiras and Mihai Dobre and John Redford and Subramanian Ramamoorthy and Stefano V. Albrecht},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2021}
}
Henry Pulver, Francisco Eiras, Ludovico Carozza, Majd Hawasly, Stefano V. Albrecht, Subramanian Ramamoorthy
PILOT: Efficient Planning by Imitation Learning and Optimisation for Safe Autonomous Driving
IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021
Abstract | BibTex | arXiv | Video
IROS, autonomous-driving
Abstract:
Achieving a proper balance between planning quality, safety and efficiency is a major challenge for autonomous driving. Optimisation-based motion planners are capable of producing safe, smooth and comfortable plans, but often at the cost of runtime efficiency. On the other hand, naively deploying trajectories produced by efficient-to-run deep imitation learning approaches might risk compromising safety. In this paper, we present PILOT -- a planning framework that comprises an imitation neural network followed by an efficient optimiser that actively rectifies the network's plan, guaranteeing fulfilment of safety and comfort requirements. The objective of the efficient optimiser is the same as the objective of an expensive-to-run optimisation-based planning system that the neural network is trained offline to imitate. This efficient optimiser provides a key layer of online protection from learning failures or deficiency in out-of-distribution situations that might compromise safety or comfort. Using a state-of-the-art, runtime-intensive optimisation-based method as the expert, we demonstrate in simulated autonomous driving experiments in CARLA that PILOT achieves a seven-fold reduction in runtime when compared to the expert it imitates without sacrificing planning quality.
@inproceedings{pulver2020pilot,
title={{PILOT:} Efficient Planning by Imitation Learning and Optimisation for Safe Autonomous Driving},
author={Henry Pulver and Francisco Eiras and Ludovico Carozza and Majd Hawasly and Stefano V. Albrecht and Subramanian Ramamoorthy},
booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year={2021}
}
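The role of the efficient optimiser, staying close to the network's plan while enforcing hard constraints, can be illustrated with a small projection sketch. Assumptions here: a 1-D plan and a single per-step speed constraint standing in for the paper's safety and comfort requirements.

```python
import numpy as np
from scipy.optimize import minimize

V_MAX = 1.0   # per-step speed cap, a stand-in for safety/comfort constraints

def rectify(proposal):
    """Optimisation stage in the spirit of PILOT's second component:
    stay as close as possible (least squares) to the learned proposal
    while enforcing the hard constraint |x_i - x_{i-1}| <= V_MAX."""
    proposal = np.asarray(proposal, float)
    cons = []
    for i in range(1, len(proposal)):
        # encode the absolute-value bound as two smooth inequalities
        cons.append({"type": "ineq",
                     "fun": lambda x, i=i: V_MAX - (x[i] - x[i - 1])})
        cons.append({"type": "ineq",
                     "fun": lambda x, i=i: V_MAX + (x[i] - x[i - 1])})
    res = minimize(lambda x: np.sum((x - proposal) ** 2),
                   proposal, constraints=cons)
    return res.x
```

An already-safe proposal passes through unchanged, so the optimiser only pays for rectification when the learned plan actually violates a constraint, which is the "online protection from learning failures" the abstract describes.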
2020
Stefano V. Albrecht, Cillian Brewitt, John Wilhelm, Balint Gyevnar, Francisco Eiras, Mihai Dobre, Subramanian Ramamoorthy
Interpretable Goal-based Prediction and Planning for Autonomous Driving
arXiv:2002.02277, 2020
Abstract | BibTex | arXiv
autonomous-driving, goal-recognition, explainable-ai
Abstract:
We propose an integrated prediction and planning system for autonomous driving which uses rational inverse planning to recognise the goals of other vehicles. Goal recognition informs a Monte Carlo Tree Search (MCTS) algorithm to plan optimal maneuvers for the ego vehicle. Inverse planning and MCTS utilise a shared set of defined maneuvers and macro actions to construct plans which are explainable by means of rationality principles. Evaluation in simulations of urban driving scenarios demonstrates the system's ability to robustly recognise the goals of other vehicles, enabling our vehicle to exploit non-trivial opportunities to significantly reduce driving times. In each scenario, we extract intuitive explanations for the predictions which justify the system's decisions.
@misc{albrecht2020integrating,
title={Interpretable Goal-based Prediction and Planning for Autonomous Driving},
author={Stefano V. Albrecht and Cillian Brewitt and John Wilhelm and Balint Gyevnar and Francisco Eiras and Mihai Dobre and Subramanian Ramamoorthy},
year={2020},
eprint={2002.02277},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
Henry Pulver, Francisco Eiras, Ludovico Carozza, Majd Hawasly, Stefano V. Albrecht, Subramanian Ramamoorthy
PILOT: Efficient Planning by Imitation Learning and Optimisation for Safe Autonomous Driving
arXiv:2011.00509, 2020
Abstract | BibTex | arXiv
autonomous-driving
Abstract:
Achieving the right balance between planning quality, safety and runtime efficiency is a major challenge for autonomous driving research. Optimisation-based planners are typically capable of producing high-quality, safe plans, but at the cost of efficiency. We present PILOT, a two-stage planning framework comprising an imitation neural network and an efficient optimisation component that guarantees the satisfaction of requirements of safety and comfort. The neural network is trained to imitate an expensive-to-run optimisation-based planning system with the same objective as the efficient optimisation component of PILOT. We demonstrate in simulated autonomous driving experiments that the proposed framework achieves a significant reduction in runtime when compared to the optimisation-based expert it imitates, without sacrificing the planning quality.
@misc{pulver2020pilot,
title={{PILOT:} Efficient Planning by Imitation Learning and Optimisation for Safe Autonomous Driving},
author={Henry Pulver and Francisco Eiras and Ludovico Carozza and Majd Hawasly and Stefano V. Albrecht and Subramanian Ramamoorthy},
year={2020},
eprint={2011.00509},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
Francisco Eiras, Majd Hawasly, Stefano V. Albrecht, Subramanian Ramamoorthy
Two-Stage Optimization-based Motion Planner for Safe Urban Driving
arXiv:2002.02215, 2020
Abstract | BibTex | arXiv
autonomous-driving
Abstract:
Recent road trials have shown that guaranteeing the safety of driving decisions is essential for the wider adoption of autonomous vehicle technology. One promising direction is to pose safety requirements as planning constraints in nonlinear, nonconvex optimization problems of motion synthesis. However, many implementations of this approach are limited by uncertain convergence and local optimality of the solutions achieved, affecting overall robustness. To improve upon these issues, we propose a novel two-stage optimization framework: in the first stage, we find a solution to a Mixed-Integer Linear Programming (MILP) formulation of the motion synthesis problem, the output of which initializes a second Nonlinear Programming (NLP) stage. The MILP stage enforces hard constraints of safety and road rule compliance generating a solution in the right subspace, while the NLP stage refines the solution within the safety bounds for feasibility and smoothness. We demonstrate the effectiveness of our framework via simulated experiments of complex urban driving scenarios, outperforming a state-of-the-art baseline in metrics of convergence, comfort and progress.
@misc{eiras2020twostage,
title={Two-Stage Optimization-based Motion Planner for Safe Urban Driving},
author={Francisco Eiras and Majd Hawasly and Stefano V. Albrecht and Subramanian Ramamoorthy},
year={2020},
eprint={2002.02215},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
2018
Craig Innes, Alex Lascarides, Stefano V. Albrecht, Subramanian Ramamoorthy, Benjamin Rosman
Reasoning about Unforeseen Possibilities During Policy Learning
arXiv:1801.03331, 2018
Abstract | BibTex | arXiv
causal
Abstract:
Methods for learning optimal policies in autonomous agents often assume that the way the domain is conceptualised - its possible states and actions and their causal structure - is known in advance and does not change during learning. This is an unrealistic assumption in many scenarios, because new evidence can reveal important information about what is possible, possibilities that the agent was not aware existed prior to learning. We present a model of an agent which both discovers and learns to exploit unforeseen possibilities using two sources of evidence: direct interaction with the world and communication with a domain expert. We use a combination of probabilistic and symbolic reasoning to estimate all components of the decision problem, including its set of random variables and their causal dependencies. Agent simulations show that the agent converges on optimal polices even when it starts out unaware of factors that are critical to behaving optimally.
@misc{innes2018reasoning,
title={Reasoning about Unforeseen Possibilities During Policy Learning},
author={Craig Innes and Alex Lascarides and Stefano V. Albrecht and Subramanian Ramamoorthy and Benjamin Rosman},
year={2018},
eprint={1801.03331},
archivePrefix={arXiv},
primaryClass={cs.AI}
}
2017
Stefano V. Albrecht, Subramanian Ramamoorthy
Exploiting Causality for Selective Belief Filtering in Dynamic Bayesian Networks (Extended Abstract)
International Joint Conference on Artificial Intelligence, 2017
Abstract | BibTex | arXiv
IJCAI, state-estimation, causal
Abstract:
Dynamic Bayesian networks (DBNs) are a general model for stochastic processes with partially observed states. Belief filtering in DBNs is the task of inferring the belief state (i.e. the probability distribution over process states) based on incomplete and uncertain observations. In this article, we explore the idea of accelerating the filtering task by automatically exploiting causality in the process. We consider a specific type of causal relation, called passivity, which pertains to how state variables cause changes in other variables. We present the Passivity-based Selective Belief Filtering (PSBF) method, which maintains a factored belief representation and exploits passivity to perform selective updates over the belief factors. PSBF is evaluated in both synthetic processes and a simulated multi-robot warehouse, where it outperformed alternative filtering methods by exploiting passivity.
@inproceedings{ albrecht2017causality,
title = {Exploiting Causality for Selective Belief Filtering in Dynamic {B}ayesian Networks (Extended Abstract)},
author = {Stefano V. Albrecht and Subramanian Ramamoorthy},
booktitle = {Proceedings of the 26th International Joint Conference on Artificial Intelligence},
address = {Melbourne, Australia},
month = {August},
year = {2017}
}
2016
Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy
Belief and Truth in Hypothesised Behaviours
Artificial Intelligence, 2016
Abstract | BibTex | arXiv | Publisher
AIJ, agent-modelling, ad-hoc-teamwork
Abstract:
There is a long history in game theory on the topic of Bayesian or “rational” learning, in which each player maintains beliefs over a set of alternative behaviours, or types, for the other players. This idea has gained increasing interest in the artificial intelligence (AI) community, where it is used as a method to control a single agent in a system composed of multiple agents with unknown behaviours. The idea is to hypothesise a set of types, each specifying a possible behaviour for the other agents, and to plan our own actions with respect to those types which we believe are most likely, given the observed actions of the agents. The game theory literature studies this idea primarily in the context of equilibrium attainment. In contrast, many AI applications have a focus on task completion and payoff maximisation. With this perspective in mind, we identify and address a spectrum of questions pertaining to belief and truth in hypothesised types. We formulate three basic ways to incorporate evidence into posterior beliefs and show when the resulting beliefs are correct, and when they may fail to be correct. Moreover, we demonstrate that prior beliefs can have a significant impact on our ability to maximise payoffs in the long-term, and that they can be computed automatically with consistent performance effects. Furthermore, we analyse the conditions under which we are able to complete our task optimally, despite inaccuracies in the hypothesised types. Finally, we show how the correctness of hypothesised types can be ascertained during the interaction via an automated statistical analysis.
@article{ albrecht2016belief,
title = {Belief and Truth in Hypothesised Behaviours},
author = {Stefano V. Albrecht and Jacob W. Crandall and Subramanian Ramamoorthy},
journal = {Artificial Intelligence},
volume = {235},
pages = {63--94},
year = {2016},
publisher = {Elsevier},
note = {DOI: 10.1016/j.artint.2016.02.004}
}
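The basic posterior-over-types computation discussed in the abstract amounts to a Bayesian product update. A minimal sketch follows; types are reduced here to fixed action distributions, which ignores the history-dependent behaviours handled in the article.

```python
import numpy as np

def update_belief(belief, types, action):
    """One Bayesian update of the posterior over hypothesised types:
    reweight each type by the probability it assigned to the observed
    action, then renormalise."""
    likelihood = np.array([t[action] for t in types])
    post = np.asarray(belief, float) * likelihood
    return post / post.sum()
```

Repeated updates concentrate belief on whichever hypothesised type best explains the observed actions; how fast, and whether the limit is "correct", is exactly what the article analyses.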
Stefano V. Albrecht, Subramanian Ramamoorthy
Exploiting Causality for Selective Belief Filtering in Dynamic Bayesian Networks
Journal of Artificial Intelligence Research, 2016
Abstract | BibTex | arXiv | Publisher
JAIR, state-estimation, causal
Abstract:
Dynamic Bayesian networks (DBNs) are a general model for stochastic processes with partially observed states. Belief filtering in DBNs is the task of inferring the belief state (i.e. the probability distribution over process states) based on incomplete and noisy observations. This can be a hard problem in complex processes with large state spaces. In this article, we explore the idea of accelerating the filtering task by automatically exploiting causality in the process. We consider a specific type of causal relation, called passivity, which pertains to how state variables cause changes in other variables. We present the Passivity-based Selective Belief Filtering (PSBF) method, which maintains a factored belief representation and exploits passivity to perform selective updates over the belief factors. PSBF produces exact belief states under certain assumptions and approximate belief states otherwise, where the approximation error is bounded by the degree of uncertainty in the process. We show empirically, in synthetic processes with varying sizes and degrees of passivity, that PSBF is faster than several alternative methods while achieving competitive accuracy. Furthermore, we demonstrate how passivity occurs naturally in a complex system such as a multi-robot warehouse, and how PSBF can exploit this to accelerate the filtering task.
@article{ albrecht2016causality,
title = {Exploiting Causality for Selective Belief Filtering in Dynamic {B}ayesian Networks},
author = {Stefano V. Albrecht and Subramanian Ramamoorthy},
journal = {Journal of Artificial Intelligence Research},
volume = {55},
pages = {1135--1178},
year = {2016},
publisher = {AI Access Foundation},
note = {DOI: 10.1613/jair.5044}
}
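The selective-update idea can be sketched with fully factored beliefs, where a passive factor is modelled (simplistically) as one whose transition is the identity and can therefore skip the prediction step. This is a toy illustration, not the PSBF algorithm itself, which handles interactions between factors.

```python
import numpy as np

def selective_filter(beliefs, transitions, obs_likelihoods):
    """Factored belief update that skips passive factors: a factor with
    no transition (None) or an identity transition keeps its predicted
    belief without doing any matrix work, then both kinds of factor
    apply the observation correction and renormalise."""
    new = []
    for b, T, like in zip(beliefs, transitions, obs_likelihoods):
        if T is None or np.array_equal(T, np.eye(len(b))):
            pred = b                  # passive factor: skip prediction
        else:
            pred = T.T @ b            # active factor: full prediction step
        post = pred * like            # observation correction
        new.append(post / post.sum())
    return new
```

The saving grows with the fraction of passive factors, which is the intuition behind the speed-ups the abstract reports for highly passive processes.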
2015
Stefano V. Albrecht, Subramanian Ramamoorthy
Are You Doing What I Think You Are Doing? Criticising Uncertain Agent Models
Conference on Uncertainty in Artificial Intelligence, 2015
Abstract | BibTex | arXiv
UAI, agent-modelling
Abstract:
The key for effective interaction in many multiagent applications is to reason explicitly about the behaviour of other agents, in the form of a hypothesised behaviour. While there exist several methods for the construction of a behavioural hypothesis, there is currently no universal theory which would allow an agent to contemplate the correctness of a hypothesis. In this work, we present a novel algorithm which decides this question in the form of a frequentist hypothesis test. The algorithm allows for multiple metrics in the construction of the test statistic and learns its distribution during the interaction process, with asymptotic correctness guarantees. We present results from a comprehensive set of experiments, demonstrating that the algorithm achieves high accuracy and scalability at low computational costs.
@inproceedings{ albrecht2015criticising,
title = {Are You Doing What {I} Think You Are Doing? Criticising Uncertain Agent Models},
author = {Stefano V. Albrecht and Subramanian Ramamoorthy},
booktitle = {Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence},
pages = {52--61},
year = {2015}
}
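The flavour of such a frequentist test can be sketched with a Monte-Carlo p-value on a log-likelihood statistic. This is a simplification: the paper supports multiple metrics and learns the test statistic's distribution during the interaction, whereas this toy samples it from a fixed model.

```python
import numpy as np

rng = np.random.default_rng(0)

def model_p_value(model_probs, observed, n_sim=2000):
    """Monte-Carlo frequentist test of a hypothesised behaviour model:
    compare the log-likelihood of the observed actions against the
    score distribution obtained by sampling actions from the model
    itself.  A small p-value suggests rejecting the hypothesis."""
    model_probs = np.asarray(model_probs, float)
    obs_score = np.log(model_probs[np.asarray(observed)]).sum()
    sims = rng.choice(len(model_probs),
                      size=(n_sim, len(observed)), p=model_probs)
    sim_scores = np.log(model_probs[sims]).sum(axis=1)
    return float(np.mean(sim_scores <= obs_score))
```

Observations that the model itself would plausibly generate receive a large p-value; observations the model finds surprising receive a small one, so the hypothesis can be rejected at a chosen significance level.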
Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy
An Empirical Study on the Practical Impact of Prior Beliefs over Policy Types
AAAI Conference on Artificial Intelligence, 2015
Abstract | BibTex | arXiv | Appendix
AAAI, agent-modelling, ad-hoc-teamwork
Abstract:
Many multiagent applications require an agent to learn quickly how to interact with previously unknown other agents. To address this problem, researchers have studied learning algorithms which compute posterior beliefs over a hypothesised set of policies, based on the observed actions of the other agents. The posterior belief is complemented by the prior belief, which specifies the subjective likelihood of policies before any actions are observed. In this paper, we present the first comprehensive empirical study on the practical impact of prior beliefs over policies in repeated interactions. We show that prior beliefs can have a significant impact on the long-term performance of such methods, and that the magnitude of the impact depends on the depth of the planning horizon. Moreover, our results demonstrate that automatic methods can be used to compute prior beliefs with consistent performance effects. This indicates that prior beliefs could be eliminated as a manual parameter and instead be computed automatically.
@inproceedings{ albrecht2015empirical,
title = {An Empirical Study on the Practical Impact of Prior Beliefs over Policy Types},
author = {Stefano V. Albrecht and Jacob W. Crandall and Subramanian Ramamoorthy},
booktitle = {Proceedings of the 29th AAAI Conference on Artificial Intelligence},
pages = {1988--1994},
year = {2015}
}
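The posterior-belief machinery described in the abstract above can be sketched as a simple Bayesian update over a hypothesised policy set: start from a prior belief, multiply in the likelihood each candidate policy assigns to every observed action, and renormalise. This is an illustrative sketch only, not the paper's code; all names (`posterior_over_types`, the toy Rock-Paper-Scissors types) are assumptions made for the example.

```python
# Illustrative sketch: a posterior belief over a hypothesised set of
# policy types, combining a prior belief with the likelihood of the
# other agent's observed actions under each type.

def posterior_over_types(prior, policies, observed_actions):
    """prior: dict type_name -> prior probability.
    policies: dict type_name -> function(history) -> dict action -> prob.
    observed_actions: actions taken by the other agent, in order.
    Returns the normalised posterior belief over types."""
    posterior = dict(prior)
    history = []
    for action in observed_actions:
        for name, policy in policies.items():
            # Multiply in the probability this type assigns to the action.
            posterior[name] *= policy(history).get(action, 0.0)
        history.append(action)
    total = sum(posterior.values())
    if total == 0.0:  # no hypothesised type explains the observations
        return dict(prior)
    return {name: p / total for name, p in posterior.items()}

# Example: two candidate types for a repeated Rock-Paper-Scissors opponent.
always_rock = lambda h: {"R": 1.0, "P": 0.0, "S": 0.0}
uniform = lambda h: {"R": 1 / 3, "P": 1 / 3, "S": 1 / 3}
belief = posterior_over_types(
    {"always_rock": 0.5, "uniform": 0.5},
    {"always_rock": always_rock, "uniform": uniform},
    ["R", "R", "R"],
)
```

After three observed Rocks, the posterior concentrates heavily on the `always_rock` type, illustrating how the prior's influence fades as evidence accumulates.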
Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy
E-HBA: Using Action Policies for Expert Advice and Agent Typification
AAAI Workshop on Multiagent Interaction without Prior Coordination, 2015
Abstract | BibTex | arXiv | Appendix
AAAIagent-modellingad-hoc-teamwork
Abstract:
Past research has studied two approaches to utilise predefined policy sets in repeated interactions: as experts, to dictate our own actions, and as types, to characterise the behaviour of other agents. In this work, we bring these complementary views together in the form of a novel meta-algorithm, called Expert-HBA (E-HBA), which can be applied to any expert algorithm that considers the average (or total) payoff an expert has yielded in the past. E-HBA gradually mixes the past payoff with a predicted future payoff, which is computed using the type-based characterisation. We present results from a comprehensive set of repeated matrix games, comparing the performance of several well-known expert algorithms with and without the aid of E-HBA. Our results show that E-HBA has the potential to significantly improve the performance of expert algorithms.
@inproceedings{ albrecht2015ehba,
title = {{E-HBA}: Using Action Policies for Expert Advice and Agent Typification},
author = {Stefano V. Albrecht and Jacob W. Crandall and Subramanian Ramamoorthy},
booktitle = {AAAI Workshop on Multiagent Interaction without Prior Coordination},
address = {Austin, Texas, USA},
month = {January},
year = {2015}
}
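The core mixing step that E-HBA applies to an expert's score, as described in the abstract, can be written in one line: blend the past average (or total) payoff with the type-based prediction of future payoff. This is a minimal sketch under our own naming; the paper's actual weighting scheme is not reproduced here, so the weight is left as an explicit parameter.

```python
# Minimal sketch of the payoff-mixing idea behind E-HBA (names and the
# fixed weight parameter are illustrative assumptions, not the paper's
# actual scheme).

def mixed_payoff(past_avg, predicted_future, weight):
    """weight in [0, 1]: how much to trust the type-based prediction of
    future payoff relative to the expert's observed past payoff."""
    assert 0.0 <= weight <= 1.0
    return (1.0 - weight) * past_avg + weight * predicted_future
```

With `weight = 0`, this reduces to the unmodified expert algorithm; with `weight = 1`, the expert is ranked purely by its predicted future payoff.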
2014
Stefano V. Albrecht, Subramanian Ramamoorthy
On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems
Conference on Uncertainty in Artificial Intelligence, 2014
Abstract | BibTex | arXiv | Appendix
UAIagent-modelling
Abstract:
While many multiagent algorithms are designed for homogeneous systems (i.e. all agents are identical), there are important applications which require an agent to coordinate its actions without knowing a priori how the other agents behave. One method to make this problem feasible is to assume that the other agents draw their latent policy (or type) from a specific set, and that a domain expert could provide a specification of this set, albeit only a partially correct one. Algorithms have been proposed by several researchers to compute posterior beliefs over such policy libraries, which can then be used to determine optimal actions. In this paper, we provide theoretical guidance on two central design parameters of this method: Firstly, it is important that the user choose a posterior which can learn the true distribution of latent types, as otherwise suboptimal actions may be chosen. We analyse convergence properties of two existing posterior formulations and propose a new posterior which can learn correlated distributions. Secondly, since the types are provided by an expert, they may be inaccurate in the sense that they do not predict the agents’ observed actions. We provide a novel characterisation of optimality which allows experts to use efficient model checking algorithms to verify optimality of types.
@inproceedings{ albrecht2014convergence,
title = {On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems},
author = {Stefano V. Albrecht and Subramanian Ramamoorthy},
booktitle = {Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence},
pages = {12--21},
year = {2014}
}
2013
Stefano V. Albrecht, Subramanian Ramamoorthy
A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems
International Conference on Autonomous Agents and Multiagent Systems, 2013
Abstract | BibTex | arXiv (full technical report) | Extended Abstract
AAMASad-hoc-teamworkagent-modelling
Abstract:
The ad hoc coordination problem is to design an autonomous agent which is able to achieve optimal flexibility and efficiency in a multiagent system with no mechanisms for prior coordination. We conceptualise this problem formally using a game-theoretic model, called the stochastic Bayesian game, in which the behaviour of a player is determined by its private information, or type. Based on this model, we derive a solution, called Harsanyi-Bellman Ad Hoc Coordination (HBA), which utilises the concept of Bayesian Nash equilibrium in a planning procedure to find optimal actions in the sense of Bellman optimal control. We evaluate HBA in a multiagent logistics domain called level-based foraging, showing that it achieves higher flexibility and efficiency than several alternative algorithms. We also report on a human-machine experiment at a public science exhibition in which the human participants played repeated Prisoner's Dilemma and Rock-Paper-Scissors against HBA and alternative algorithms, showing that HBA achieves equal efficiency and a significantly higher welfare and winning rate.
@inproceedings{ albrecht2013game,
title = {A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems},
author = {Stefano V. Albrecht and Subramanian Ramamoorthy},
booktitle = {Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems},
address = {St. Paul, Minnesota, USA},
month = {May},
year = {2013}
}
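A one-step simplification of the idea behind HBA can be sketched as follows: choose our action to maximise expected payoff, where the expectation weights each hypothesised opponent type by our current belief in it. The full method in the paper plans over future steps using Bayesian Nash equilibria; the function names and the Prisoner's Dilemma payoff table below are made up for the example.

```python
# Illustrative one-step best response against a belief over opponent
# types (a simplification of HBA's deeper planning procedure).

def best_response(beliefs, type_policies, payoff, our_actions, history):
    """beliefs: dict type -> probability.
    type_policies: dict type -> function(history) -> dict opp_action -> prob.
    payoff(ours, theirs) -> our payoff for the joint action."""
    def expected_value(ours):
        return sum(
            b * p * payoff(ours, theirs)
            for t, b in beliefs.items()
            for theirs, p in type_policies[t](history).items()
        )
    return max(our_actions, key=expected_value)

# Example: Prisoner's Dilemma, believing the opponent is mostly a defector.
payoffs = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
act = best_response(
    {"defector": 0.9, "cooperator": 0.1},
    {"defector": lambda h: {"D": 1.0},
     "cooperator": lambda h: {"C": 1.0}},
    lambda a, b: payoffs[(a, b)],
    ["C", "D"],
    [],
)
# Expected values: EV(C) = 0.3, EV(D) = 1.4, so "D" is chosen.
```

The myopic version above ignores how our action shapes the future history; HBA's planning procedure handles that by looking ahead over the game tree.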
2012
Stefano V. Albrecht, Subramanian Ramamoorthy
Comparative Evaluation of Multiagent Learning Algorithms in a Diverse Set of Ad Hoc Team Problems
International Conference on Autonomous Agents and Multiagent Systems, 2012
Abstract | BibTex | arXiv
AAMASmulti-agent-rlad-hoc-teamwork
Abstract:
This paper is concerned with evaluating different multiagent learning (MAL) algorithms in problems where individual agents may be heterogeneous, in the sense of utilizing different learning strategies, without the opportunity for prior agreements or information regarding coordination. Such a situation arises in ad hoc team problems, a model of many practical multiagent systems applications. Prior work in multiagent learning has often been focussed on homogeneous groups of agents, meaning that all agents were identical and a priori aware of this fact. Also, those algorithms that are specifically designed for ad hoc team problems are typically evaluated in teams of agents with fixed behaviours, as opposed to agents which are adapting their behaviours. In this work, we empirically evaluate five MAL algorithms, representing major approaches to multiagent learning but originally developed with the homogeneous setting in mind, to understand their behaviour in a set of ad hoc team problems. All teams consist of agents which are continuously adapting their behaviours. The algorithms are evaluated with respect to a comprehensive characterisation of repeated matrix games, using performance criteria that include considerations such as attainment of equilibrium, social welfare and fairness. Our main conclusion is that there is no clear winner. However, the comparative evaluation also highlights the relative strengths of different algorithms with respect to the type of performance criteria, e.g., social welfare vs. attainment of equilibrium.
@inproceedings{ albrecht2012comparative,
title = {Comparative Evaluation of {MAL} Algorithms in a Diverse Set of Ad Hoc Team Problems},
author = {Stefano V. Albrecht and Subramanian Ramamoorthy},
booktitle = {Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems},
pages = {349--356},
year = {2012}
}