Representation discovery using a fixed basis in reinforcement learning

Wookey, Dean Stephen

dc.contributor.author	Wookey, Dean Stephen
dc.date.accessioned	2017-01-18T08:06:46Z
dc.date.available	2017-01-18T08:06:46Z
dc.date.issued	2016
dc.description	A thesis presented for the degree of Doctor of Philosophy, School of Computer Science and Applied Mathematics. University of the Witwatersrand, South Africa. 26 August 2016.	en_ZA
dc.description.abstract	In the reinforcement learning paradigm, an agent learns by interacting with its environment. At each state, the agent receives a numerical reward. Its goal is to maximise the discounted sum of future rewards. One way it can do this is by learning a value function: a function that maps states to the discounted sum of future rewards. With an accurate value function and a model of the environment, the agent can take the optimal action in each state. In practice, however, the value function is approximated, and performance depends on the quality of the approximation. Linear function approximation is a commonly used approximation scheme, where the value function is represented as a weighted sum of basis functions or features. In continuous state environments, there are infinitely many such features to choose from, introducing the new problem of feature selection. Existing algorithms such as OMP-TD are slow to converge, scale poorly to high-dimensional spaces, and have not been generalised to the online learning case. We introduce heuristic methods for reducing the search space in high dimensions that significantly reduce computational costs and also act as regularisers. We extend these methods and introduce feature regularisation for incremental feature selection in the batch learning case, and show that introducing a smoothness prior is effective with our SSOMP-TD and STOMP-TD algorithms. Finally, we generalise OMP-TD and our algorithms to the online case and evaluate them empirically.	en_ZA
dc.description.librarian	LG2017	en_ZA
dc.format.extent	Online resource (v, 74 leaves)
dc.identifier.citation	Wookey, Dean Stephen (2016) Representation discovery using a fixed basis in reinforcement learning, University of the Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21642>
dc.identifier.uri	http://hdl.handle.net/10539/21642
dc.language.iso	en	en_ZA
dc.subject.lcsh	Reinforcement learning
dc.title	Representation discovery using a fixed basis in reinforcement learning	en_ZA
dc.type	Thesis	en_ZA

Files

Original bundle

Name: wookeythesisfinal.pdf
Size: 4.1 MB
Format: Adobe Portable Document Format
Description: Main article

Collections

ETD Collection