Improving central value functions for cooperative multi-agent reinforcement learning

dc.contributor.author: Singh, Siddarth
dc.date.accessioned: 2023-11-21T07:47:54Z
dc.date.available: 2023-11-21T07:47:54Z
dc.date.issued: 2022
dc.description: A dissertation submitted in fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, University of the Witwatersrand, Johannesburg, 2022
dc.description.abstract: Central value functions (CVFs) use a shared centralised critic to decompose the global shared reward in the cooperative setting into individual local rewards, and are an effective approach to value decomposition in multi-agent reinforcement learning problems. However, many state-of-the-art methods rely on an easily defined ground-truth state to perform credit assignment, and they perform poorly in certain environments with large numbers of redundant agents. We propose a method called Relevance Decomposition Network (RDN) that uses layer-wise relevance propagation (LRP) as an alternative form of credit assignment, performing value decomposition better than existing methods such as QMIX and Value-Decomposition Networks (VDN) when many redundant agents are present. Another limitation of the MARL space is that it has generally favoured Q-learning-based algorithms, which can be attributed to the belief that, owing to the poor sample efficiency of on-policy learning, on-policy methods are ineffective in the large action and state spaces of the multi-agent setting. We use a small set of improvements, generalisable to most on-policy actor-critic algorithms, that accommodate a small amount of off-policy data to improve sample efficiency and increase training stability. We implemented our improved agent variants and tested them in a variety of environments, including the StarCraft Multi-Agent Challenge (SMAC). Our proposed method greatly improved the performance of a basic naive multi-agent advantage actor-critic algorithm, with faster convergence to high-performing policies and reduced variance in expected performance at all stages of training.
dc.description.librarian: PC(2023)
dc.faculty: Faculty of Science
dc.identifier.uri: https://hdl.handle.net/10539/37053
dc.language.iso: en
dc.school: Computer Science and Applied Mathematics
dc.subject: Central value functions
dc.subject: Multi-agent reinforcement learning
dc.subject: Relevance Decomposition Network (RDN)
dc.title: Improving central value functions for cooperative multi-agent reinforcement learning
dc.type: Dissertation
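
For context, the value decomposition the abstract refers to (the VDN baseline that RDN and QMIX build richer credit-assignment schemes on top of) factors the joint action-value into a sum of per-agent utilities. The sketch below is a minimal, illustrative rendering of that additive baseline only; the network sizes, class names, and interface are assumptions, and it is not the RDN method proposed in the dissertation.

    import torch
    import torch.nn as nn

    class AgentQNet(nn.Module):
        """Per-agent utility network Q_i(o_i, a_i) over a discrete action set (illustrative)."""
        def __init__(self, obs_dim, n_actions, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, n_actions),
            )

        def forward(self, obs):
            return self.net(obs)  # (batch, n_actions)

    def joint_q_vdn(agent_nets, observations, actions):
        """VDN-style decomposition: Q_tot = sum_i Q_i(o_i, a_i).

        observations: list of (batch, obs_dim) tensors, one per agent
        actions: (batch, n_agents) long tensor of chosen discrete actions
        """
        per_agent_q = []
        for i, (net, obs) in enumerate(zip(agent_nets, observations)):
            q_i = net(obs).gather(1, actions[:, i:i + 1])  # select chosen action's value, (batch, 1)
            per_agent_q.append(q_i)
        return torch.stack(per_agent_q, dim=0).sum(dim=0)  # additive mixing, (batch, 1)

    if __name__ == "__main__":
        # Hypothetical dimensions: 3 agents, 10-dim observations, 5 actions, batch of 4.
        nets = [AgentQNet(obs_dim=10, n_actions=5) for _ in range(3)]
        obs = [torch.randn(4, 10) for _ in range(3)]
        acts = torch.randint(0, 5, (4, 3))
        print(joint_q_vdn(nets, obs, acts).shape)  # torch.Size([4, 1])

QMIX replaces the plain sum with a state-conditioned monotonic mixing network, which is why it depends on a well-defined ground-truth state; the dissertation's RDN instead uses layer-wise relevance propagation for credit assignment, which is not shown here.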
Files
Original bundle
Name: msc_dissertation.pdf
Size: 6.71 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.43 KB
Description: Item-specific license agreed upon to submission