Formation Strategy Optimization Using Multi-Agent Reinforcement Learning
Publisher
University of the Witwatersrand, Johannesburg
Abstract
A long-standing goal of artificial intelligence is to design agents capable of cooperative problem-solving. The RoboCup soccer competition provides a challenging environment for investigating the design of such intelligent, autonomous agents using machine learning techniques such as Multi-Agent Reinforcement Learning (MARL). Cooperation is inherently difficult because agents must align their strategies, adapt to each other’s actions, and make decisions that favour the collective goal over individual success. Developing effective defensive strategies is particularly challenging in this context: agents must not only understand and anticipate the actions of opponents but also coordinate with teammates in a dynamic environment where split-second decisions can determine the outcome of a game. This complexity is compounded by the unpredictability of opponents’ strategies and the continuous adaptation required to counter them. This research investigates the application of MARL to learning an effective defensive strategy in the RoboCup soccer competition. We use reward shaping to guide the behaviour of our simulated soccer players so that they can defend effectively against attacking strategies. The shaped reward function is then used to train agents and learn the optimal policy in a centralized setting, using Central Proximal Policy Optimization (CPPO). During training, our agents are exposed to different fixed opponent policies: the Keepaway approach, in which a team keeps the ball away from opponents; the Half-field offense strategy, in which the offense team tries to outplay the defense team to score goals; and random direction changes, which mimic the unpredictability of human soccer, where players often change direction suddenly to evade defenders or create attacking opportunities.
To evaluate our proposed strategy, we compare it against two established baselines in the Keepaway, Half-field offense, and random-direction-change scenarios: the NeuroHassle approach, which emphasizes early disruption of the opponent’s attack, and the Stable Marriage approach, which focuses on optimal defender-attacker pairings. These methods, one based on reinforcement learning and the other on preference-based pairing, serve as benchmarks for gauging how well our strategy improves defensive gameplay. Evaluation metrics, including goals conceded and the average distance between opponent players and our goal, are used to identify the strengths and weaknesses of each approach. On these metrics, our model demonstrated better defensive strategies, offering insights for enhancing multi-agent systems in competitive environments like RoboCup soccer.
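For readers unfamiliar with preference-based pairing, the sketch below illustrates the general idea behind a Stable Marriage style defender-attacker assignment using the standard Gale-Shapley procedure. It is not taken from the thesis. The function name `stable_pairing`, the 2D position tuples, and the use of Euclidean distance as the preference ranking for both sides are assumptions made purely for illustration; the baseline evaluated in the thesis may rank pairings differently.

```python
import math


def stable_pairing(defenders, attackers):
    """Pair defenders with attackers via Gale-Shapley (defenders propose).

    defenders, attackers: dicts mapping a player name to an (x, y) position.
    Assumption: both sides rank the other side nearest-first.
    Returns a dict mapping defender name -> attacker name.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Each defender's preference list: attackers sorted nearest-first.
    d_pref = {
        d: sorted(attackers, key=lambda a: dist(defenders[d], attackers[a]))
        for d in defenders
    }
    # For each attacker, the rank of every defender (0 = nearest).
    a_rank = {
        a: {
            d: rank
            for rank, d in enumerate(
                sorted(defenders, key=lambda d: dist(defenders[d], attackers[a]))
            )
        }
        for a in attackers
    }

    next_choice = {d: 0 for d in defenders}  # index of next attacker to propose to
    engaged = {}                             # attacker -> defender
    free = list(defenders)

    while free:
        d = free.pop()
        if next_choice[d] >= len(d_pref[d]):
            continue  # no attackers left to propose to; defender stays unpaired
        a = d_pref[d][next_choice[d]]
        next_choice[d] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = d                   # attacker was unmarked
        elif a_rank[a][d] < a_rank[a][current]:
            engaged[a] = d                   # closer defender takes over
            free.append(current)
        else:
            free.append(d)                   # rejected; try the next attacker

    return {d: a for a, d in engaged.items()}
```

With defenders at (0, 0) and (1, 0) and attackers at (2, 0) and (10, 0), the defender at (1, 0) is assigned the nearer attacker and the other defender takes the far one. The result is stable in the Gale-Shapley sense: no defender and attacker would both rank each other above their assigned partners.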
Description
A thesis submitted in partial fulfilment of the requirements for the Degree of Master of Science, to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2024.
Citation
Njupoun, Abdel Mfougouon. (2024). Formation Strategy Optimization Using Multi-Agent Reinforcement Learning. [Master's dissertation, University of the Witwatersrand, Johannesburg]. WIReDSpace. https://hdl.handle.net/10539/46675