Efficient curriculum generation for reinforcement learning

Date

2021

Authors

Dunn, Leroy

Abstract

Complex tasks involving large state spaces are difficult for Reinforcement Learning agents to solve because the agents learn by trial and error, which requires a large amount of interaction with their environment. Curriculum Learning in Reinforcement Learning aims to accelerate training or improve the performance of a Reinforcement Learning agent on a complex target task by first training the agent on a sequence of intermediate tasks (a curriculum). Current automated approaches to curriculum generation typically treat the cost of generating a curriculum as a sunk cost, since it is challenging to assess how much time, effort, and prior knowledge are used to design a curriculum, and focus instead on evaluating the resulting curricula. This is a limitation when the aim is to accelerate agent training, since the time to generate a curriculum may outweigh the time to learn the target task directly. We study the cost of curriculum generation with the objective of efficiently designing high-quality curricula. We propose a simple measure of the cost of curriculum generation (the total time of intermediate training) and introduce a novel curriculum generation algorithm to minimize this cost. Our algorithm quantifies and learns task utility online to rank candidate curricula, and it outperforms comparable curriculum generation methods under the proposed measure. We evaluate our algorithm in two domains and show that it generates higher-quality curricula than baseline methods under time restrictions and finds an optimal curriculum in a curriculum search space faster than baseline methods.
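The proposed cost measure, the total time of intermediate training, can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the names `TaskRun`, `generation_cost`, and `curriculum_pays_off`, the task names, and the use of environment steps as the unit of time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRun:
    """One intermediate-task training run made while searching for a curriculum (hypothetical record)."""
    task: str
    steps: int  # training time spent on this run; environment steps are an assumed unit


def generation_cost(runs):
    """Total intermediate training time summed over every run made during curriculum generation."""
    return sum(run.steps for run in runs)


def curriculum_pays_off(runs, target_steps_with_curriculum, target_steps_from_scratch):
    """True if generation cost plus target-task training is less than learning the target task directly."""
    total = generation_cost(runs) + target_steps_with_curriculum
    return total < target_steps_from_scratch


# Illustrative numbers only; they are not results from the dissertation.
runs = [
    TaskRun("small_grid", 2000),
    TaskRun("medium_grid", 5000),
    TaskRun("small_grid", 1500),  # a repeated candidate still adds to the cost
]
cost = generation_cost(runs)
worthwhile = curriculum_pays_off(runs, 10_000, 25_000)
```

Counting every intermediate run, including runs on candidates that are later discarded, is what keeps generation from being treated as a sunk cost: the curriculum is only worthwhile if the search itself plus the remaining target-task training costs less than learning the target task from scratch.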

Description

A dissertation submitted in fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2021.
