Using machine learning to learn from demonstration: application to the AR.Drone quadrotor control
Date
2016-05-10
Authors
Fu, Kuan-Hsiang
Abstract
Developing a robot that can operate autonomously is an active area of robotics research. An autonomously
operating robot has many potential applications, such as surveillance and inspection,
search and rescue, and operation in hazardous environments. Reinforcement learning, a branch of machine
learning, provides an attractive framework for developing robust control algorithms since it is less
demanding in terms of both knowledge and programming effort. Given a reward function, reinforcement
learning uses trial and error to make an agent learn. In practice it is computationally intractable
for an agent to learn “de novo”, so it is important to provide the learning system with “a
priori” knowledge. Such prior knowledge can take the form of demonstrations performed by a
teacher. However, prior knowledge does not necessarily guarantee that the agent will perform well. The
performance of the agent usually depends on the reward function, since the reward function describes
the formal specification of the control task. However, problems arise with complex reward functions
that are difficult to specify manually. To address these problems, apprenticeship learning via
inverse reinforcement learning is used: it extracts a reward function from the set of
demonstrations so that the agent can optimise its
performance with respect to that reward function. In this research, a flight controller for the AR.Drone
quadrotor was created using a reinforcement learning algorithm and function approximators with some
prior knowledge. The agent was able to perform a manoeuvre that is similar to the one demonstrated by
the teacher.
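As an illustration of how a reward function can be extracted from demonstrations, the following is a minimal sketch of the feature-expectation idea commonly used in apprenticeship learning via inverse reinforcement learning. It is not taken from the dissertation: the function names, the one-dimensional "hover error" state, the feature map `phi`, and the discount factor are all assumptions chosen for illustration, and only a single weight-estimation step is shown rather than the full iterative algorithm.

```python
from typing import Callable, List, Sequence

State = Sequence[float]
Trajectory = List[State]
FeatureMap = Callable[[State], List[float]]


def feature_expectations(
    trajectories: List[Trajectory], phi: FeatureMap, gamma: float = 0.9
) -> List[float]:
    """Average discounted sum of features over a set of trajectories."""
    if not trajectories or not trajectories[0]:
        raise ValueError("need at least one non-empty trajectory")
    dim = len(phi(trajectories[0][0]))
    mu = [0.0] * dim
    for traj in trajectories:
        for t, state in enumerate(traj):
            features = phi(state)
            for i in range(dim):
                mu[i] += (gamma ** t) * features[i]
    return [m / len(trajectories) for m in mu]


def reward_weights(mu_teacher: List[float], mu_agent: List[float]) -> List[float]:
    """One estimation step: weights point from the agent's feature
    expectations towards the teacher's (assumed linear reward)."""
    return [e - a for e, a in zip(mu_teacher, mu_agent)]


def reward(weights: List[float], phi: FeatureMap, state: State) -> float:
    """Linear reward R(s) = w . phi(s)."""
    return sum(w * f for w, f in zip(weights, phi(state)))


# Hypothetical example: the state is a 1-D hover error, the feature is its magnitude.
def phi(state: State) -> List[float]:
    return [abs(state[0])]


teacher_demos = [[[0.1], [0.0]]]   # teacher stays close to the target
agent_rollouts = [[[1.0], [1.0]]]  # untrained agent stays far from it

mu_teacher = feature_expectations(teacher_demos, phi, gamma=0.5)
mu_agent = feature_expectations(agent_rollouts, phi, gamma=0.5)
w = reward_weights(mu_teacher, mu_agent)
```

Under these assumptions the recovered weight is negative, so states with a larger hover error receive a lower reward, and a reinforcement learning agent optimising this reward is pushed towards the teacher's behaviour.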
Description
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. December 14, 2015