Dynamics generalisation in reinforcement learning through the use of adaptive policies

Date
2024
Abstract
Reinforcement learning (RL) is a widely used method for training agents to interact with an external environment, and is commonly applied in fields such as robotics. While RL has achieved success in several domains, many methods fail to generalise well to scenarios different from those encountered during training. This is a significant limitation that hinders RL's real-world applicability. In this work, we consider the problem of generalising to new transition dynamics, corresponding to cases in which the effects of the agent's actions differ; for instance, walking on a slippery versus a rough floor. To address this problem, we introduce a neural network architecture, the Decision Adapter, which leverages contextual information to modulate the behaviour of an agent depending on the setting it is in. In particular, our method uses the context (information about the current environment, such as the floor's friction) to generate the weights of an adapter module which influences the agent's actions. This, for instance, allows an agent to act differently when walking on ice compared to gravel. We theoretically show that our approach generalises a prior network architecture, and empirically demonstrate that it results in superior generalisation performance compared to previous approaches in several environments. Furthermore, we show that our method can be applied to multiple RL algorithms, making it a widely applicable approach to improving generalisation.
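The sketch below is a minimal, illustrative rendering of the idea described in the abstract (a context vector generating the weights of an adapter module that modulates the agent's features), not the authors' Decision Adapter implementation; all class names, dimensions, and the residual connection are assumptions made for illustration.

import torch
import torch.nn as nn

class ContextAdapter(nn.Module):
    """Hypothetical adapter whose weights are generated from a context vector."""

    def __init__(self, feature_dim: int, context_dim: int, hyper_hidden: int = 64):
        super().__init__()
        self.feature_dim = feature_dim
        # Hypernetwork: maps the context to the weights and bias of a
        # feature_dim x feature_dim linear adapter layer.
        self.hyper = nn.Sequential(
            nn.Linear(context_dim, hyper_hidden),
            nn.ReLU(),
            nn.Linear(hyper_hidden, feature_dim * feature_dim + feature_dim),
        )

    def forward(self, features: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # features: (batch, feature_dim); context: (batch, context_dim)
        params = self.hyper(context)
        w = params[:, : self.feature_dim ** 2].view(-1, self.feature_dim, self.feature_dim)
        b = params[:, self.feature_dim ** 2 :]
        # Apply the context-generated linear map, with a residual connection so the
        # adapter modulates (rather than replaces) the agent's base features.
        adapted = torch.bmm(w, features.unsqueeze(-1)).squeeze(-1) + b
        return features + adapted

if __name__ == "__main__":
    # Example: a 1-D context (e.g. floor friction) modulating 32-D policy features.
    adapter = ContextAdapter(feature_dim=32, context_dim=1)
    feats = torch.randn(4, 32)
    ctx = torch.tensor([[0.1], [0.5], [0.9], [1.3]])  # e.g. ice ... gravel
    print(adapter(feats, ctx).shape)  # torch.Size([4, 32])

In practice, such an adapter would sit inside the policy network of a standard RL algorithm (e.g. SAC or PPO), which is consistent with the abstract's claim that the method can be combined with multiple RL algorithms.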
Description
A research report submitted in partial fulfilment of the requirements for the degree Master of Science to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2023
Keywords
Reinforcement learning, Robotics