Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces

dc.contributor.author: Bester, Craig James
dc.date.accessioned: 2020-09-09T08:08:22Z
dc.date.available: 2020-09-09T08:08:22Z
dc.date.issued: 2019
dc.description: Dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science, Johannesburg, June 2019 [en_ZA]
dc.description.abstract: Parameterised actions in reinforcement learning are composed of discrete actions with continuous action-parameters. This provides a framework capable of solving complex domains that require learning high-level action policies with flexible control. Recently, deep Q-networks have been extended to learn over such action spaces with the P-DQN algorithm. However, the method treats all action-parameters as a single joint input to the Q-network, invalidating its theoretical foundations. We demonstrate the disadvantages of this approach and propose two solutions: using split Q-networks, and a novel multi-pass technique. We also propose a weighted-indexed action-parameter loss function to address issues related to the imbalance of sampling and exploration between different parameterised actions. We empirically demonstrate that both our multi-pass algorithm and weighted-indexed loss significantly outperform P-DQN and other previous algorithms in terms of data efficiency and converged policy performance on the Platform, Robot Soccer Goal, and Half Field Offense domains. [en_ZA]
dc.description.librarian: XN2020 [en_ZA]
dc.faculty: Faculty of Science [en_ZA]
dc.format.extent: Online resource (viii, 100 leaves)
dc.identifier.citation: Bester, Craig James (2019) Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces, University of the Witwatersrand, Johannesburg, https://hdl.handle.net/10539/29568
dc.identifier.uri: https://hdl.handle.net/10539/29568
dc.language.iso: en [en_ZA]
dc.school: School of Computer Science and Applied Mathematics [en_ZA]
dc.subject.lcsh: Machine learning
dc.subject.lcsh: Computer multitasking
dc.title: Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: cbester_msc_dissertation.pdf
Size: 3.34 MB
Format: Adobe Portable Document Format
Description: Main work
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission