James, Steven
2022-07-28
2022-07-28
2021
https://hdl.handle.net/10539/33061
A thesis submitted to the Faculty of Science, University of the Witwatersrand, in fulfilment of the requirements for the degree of Doctor of Philosophy, 2021

A major goal of artificial intelligence (AI) is to create agents capable of acting effectively in a wide variety of complex environments. A popular framework for modelling decision-making agents is reinforcement learning (RL), where an agent learns from interaction with its environment. Though RL has proven successful in solving a number of challenging tasks, one major hurdle to the development of truly autonomous agents is the need to specify appropriate task representations. In RL, the most common approach is for a human designer to simply provide the agent with the task description by defining the state space, rewards, goals and actions available to the agent. While this approach is feasible within the bounds of narrowly defined tasks, it must clearly be dispensed with if we are ever to construct agents with full autonomy. In this thesis, we concern ourselves with the question of how an agent can acquire its own representations from sensory data. We restrict our focus to learning representations for long-term planning, a class of problems that state-of-the-art learning methods are unable to solve. We take inspiration from the way humans reason about the world: although we must sense and act in the real world, we do not reason at such a low level. Rather, we use mental abstractions of our environment that ignore irrelevant minutiae. When acting, we can make use of abstraction to employ high-level skills, known in RL as options. By learning and planning with both state and action abstractions, we are ultimately able to construct plans consisting of thousands of actions. Importantly, a feature of human intelligence is that we are proficient at a wide array of tasks.
One key aspect that allows us to quickly solve new problems is our ability to reuse previously learned abstract representations. For example, once we acquire a conceptual representation of a door, we can simply apply this to any new doors we may encounter, independent of the lighting conditions, the location of the door or its colour. Since tabula rasa learning is infeasible for robots, learning transferable representations is key to scaling AI approaches to real-world agents. We propose various methods for autonomously learning symbolic representations of an agent’s environment. Importantly, these symbols are task-independent, and so can be recycled to solve new tasks. In particular, we make three main contributions. First, we demonstrate how an agent can use an existing set of options to acquire representations from egocentric observations. Since the resulting abstractions are agent-centric, they can immediately be reused by the same agent in new environments. We show how to combine these portable representations with problem-specific ones to generate a sound task description that can be used for abstract planning. Our results demonstrate that our approach allows an agent to transfer previous knowledge to new tasks, improving sample efficiency as the number of tasks increases. Our second contribution is to leverage the fact that the real world consists of objects that an agent can observe and interact with. Based on this assumption, we show how to construct object-centric abstractions that can be used when an agent finds itself in a new task containing similar objects. As a result, an agent can convert observations from a high-dimensional environment (such as a video game) to object-centric textual symbols that can be given as input to classical planners. Once more, the transferability of the learned representations allows an agent to learn subsequent tasks using fewer environment observations.
Finally, we show how to autonomously construct a multi-level hierarchy consisting of increasingly abstract representations. Since these hierarchies are transferable, higher-order concepts can be reused in new tasks, removing the need for the agent to re-learn them and improving sample efficiency. The hierarchy further allows the agent to plan at a variety of levels, reducing the size of the problem and thereby improving planning efficiency.

en
Learning portable symbolic representations
Thesis