Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces
dc.contributor.author | Bester, Craig James | |
dc.date.accessioned | 2020-09-09T08:08:22Z | |
dc.date.available | 2020-09-09T08:08:22Z | |
dc.date.issued | 2019 | |
dc.description | A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, June 2019 | en_ZA |
dc.description.abstract | Parameterised actions in reinforcement learning are composed of discrete actions with continuous action-parameters. This provides a framework capable of solving complex domains that require learning high-level action policies with flexible control. Recently, deep Q-networks have been extended to learn over such action spaces with the P-DQN algorithm. However, the method treats all action-parameters as a single joint input to the Q-network, invalidating its theoretical foundations. We demonstrate the disadvantages of this approach and propose two solutions: using split Q-networks, and a novel multi-pass technique. We also propose a weighted-indexed action-parameter loss function to address issues related to the imbalance of sampling and exploration between different parameterised actions. We empirically demonstrate that both our multi-pass algorithm and weighted-indexed loss significantly outperform P-DQN and other previous algorithms in terms of data efficiency and converged policy performance on the Platform, Robot Soccer Goal, and Half Field Offense domains. | en_ZA |
dc.description.librarian | XN2020 | en_ZA |
dc.faculty | Faculty of Science | en_ZA |
dc.format.extent | Online resource (viii, 100 leaves) | |
dc.identifier.citation | Bester, Craig James, (2019) Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces, University of the Witwatersrand, Johannesburg, https://hdl.handle.net/10539/29568 | |
dc.identifier.uri | https://hdl.handle.net/10539/29568 | |
dc.language.iso | en | en_ZA |
dc.school | School of Computer Science and Applied Mathematics | en_ZA |
dc.subject.lcsh | Machine learning | |
dc.subject.lcsh | Computer multitasking | |
dc.title | Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces | en_ZA |
dc.type | Thesis | en_ZA |
Files
Original bundle
- Name:
- cbester_msc_dissertation.pdf
- Size:
- 3.34 MB
- Format:
- Adobe Portable Document Format
- Description:
- Main work
License bundle
- Name:
- license.txt
- Size:
- 1.71 KB
- Format:
- Item-specific license agreed upon to submission