Reinforcement learning with parameterized actions

Masson, Warwick Anthony

dc.contributor.author	Masson, Warwick Anthony
dc.date.accessioned	2017-01-18T07:37:29Z
dc.date.available	2017-01-18T07:37:29Z
dc.date.issued	2016
dc.description	A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. Johannesburg, 2016.	en_ZA
dc.description.abstract	In order to complete real-world tasks, autonomous robots require a mix of fine-grained control and high-level skills. A robot requires a wide range of skills to handle a variety of different situations, but must also be able to adapt its skills to handle a specific situation. Reinforcement learning is a machine learning paradigm for learning to solve tasks by interacting with an environment. Current methods in reinforcement learning focus on agents with either a fixed number of discrete actions or a continuous set of actions. We consider the problem of reinforcement learning with parameterized actions: discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. By representing actions in this way, we have the high-level skills given by discrete actions and the adaptability given by the parameters for each action. We introduce the Q-PAMDP algorithm for model-free learning in parameterized action Markov decision processes. Q-PAMDP alternates between learning which discrete actions to use in each state and learning which parameters to use in those states. We show that under weak assumptions, Q-PAMDP converges to a local maximum. We compare Q-PAMDP with a direct policy search approach in the Goal and Platform domains. Q-PAMDP outperforms direct policy search in both domains.	en_ZA
dc.description.librarian	TG2016	en_ZA
dc.format.extent	Online resource (46 leaves)
dc.identifier.citation	Masson, Warwick Anthony (2016) Reinforcement learning with parameterized actions, University of the Witwatersrand, Johannesburg, <http://wiredspace.wits.ac.za/handle/10539/21639>
dc.identifier.uri	http://hdl.handle.net/10539/21639
dc.language.iso	en	en_ZA
dc.subject.lcsh	Reinforcement learning
dc.title	Reinforcement learning with parameterized actions	en_ZA
dc.type	Thesis	en_ZA

Files

Original bundle

Name: warwick_masson_reinforcement_learning_with_parameterized_act.pdf
Size: 2.29 MB
Format: Adobe Portable Document Format
Description: Main article

Name: declaration.jpg
Size: 1.47 MB
Format: Joint Photographic Experts Group/JPEG File Interchange Format (JFIF)
Description: Declaration
