ETD Collection

Permanent URI for this collection: https://wiredspace.wits.ac.za/handle/10539/104


Please note: Digitised content is made available at the best possible quality range, taking into consideration file size and the condition of the original item. These restrictions may sometimes affect the quality of the final published item. For queries regarding the content of the ETD collection, please contact the IR specialists by email or Tel: 011 717 4652 / 1954.

Follow the link below for important information about Electronic Theses and Dissertations (ETD)

Library Guide about ETD

Search Results

Now showing 1 - 1 of 1
  • Item
    Multi-pass deep Q-networks for reinforcement learning with parameterised action spaces
    (2019) Bester, Craig James
    Parameterised actions in reinforcement learning are composed of discrete actions with continuous action-parameters. This provides a framework capable of solving complex domains that require learning high-level action policies with flexible control. Recently, deep Q-networks have been extended to learn over such action spaces with the P-DQN algorithm. However, the method treats all action-parameters as a single joint input to the Q-network, invalidating its theoretical foundations. We demonstrate the disadvantages of this approach and propose two solutions: using split Q-networks, and a novel multi-pass technique. We also propose a weighted-indexed action-parameter loss function to address issues related to the imbalance of sampling and exploration between different parameterised actions. We empirically demonstrate that both our multi-pass algorithm and weighted-indexed loss significantly outperform P-DQN and other previous algorithms in terms of data efficiency and converged policy performance on the Platform, Robot Soccer Goal, and Half Field Offense domains.
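
    The joint-input issue described in the abstract can be illustrated with a small sketch. The abstract does not specify how the multi-pass technique works, so the code below assumes one plausible scheme: the Q-function is evaluated once per discrete action, with the action-parameters of all other actions set to zero, and only the matching output is kept. The names `toy_q`, `joint_q`, and `multi_pass_q` are hypothetical, and `toy_q` is a hand-written stand-in for a neural network with two discrete actions and one parameter each. It is not the thesis implementation.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a Q-network: maps a state and the concatenated
# action-parameters of all discrete actions to one Q-value per action.
QFunction = Callable[[Sequence[float], Sequence[float]], List[float]]


def toy_q(state: Sequence[float], params: Sequence[float]) -> List[float]:
    """Toy Q-function for 2 actions with 1 parameter each.

    Every output depends on every input parameter, mimicking a network
    that receives all action-parameters as one joint input.
    """
    s = sum(state)
    p_total = sum(params)
    return [s + params[0] + 0.5 * p_total, s - params[1] + 0.5 * p_total]


def joint_q(q: QFunction, state, params_per_action) -> List[float]:
    """Single pass: all action-parameters are fed in jointly."""
    flat = [x for p in params_per_action for x in p]
    return q(state, flat)


def multi_pass_q(q: QFunction, state, params_per_action) -> List[float]:
    """One pass per discrete action k (assumed zero-masking scheme).

    In pass k, the parameters of every other action are replaced by 0.0,
    and only output k of that pass is kept.
    """
    values = []
    for k in range(len(params_per_action)):
        masked = [
            x if j == k else 0.0
            for j, p in enumerate(params_per_action)
            for x in p
        ]
        values.append(q(state, masked)[k])
    return values
```

    Under these assumptions, changing only the second action's parameter alters the first action's value in `joint_q` but leaves it unchanged in `multi_pass_q`, which is the kind of dependence the abstract identifies as a problem with the joint input.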