School of Computer Science and Applied Mathematics (ETDs)
Browsing School of Computer Science and Applied Mathematics (ETDs) by SDG "SDG-8: Decent work and economic growth"
Item
Analyzing the performance and generalisability of incorporating SimCLR into Proximal Policy Optimization in procedurally generated environments
(University of the Witwatersrand, Johannesburg, 2024) Gilbert, Nikhil; Rosman, Benjamin
Multiple approaches to state representation learning have been shown to improve the performance of reinforcement learning agents substantially. A known challenge when applying state representation learning to reinforcement learning is enabling an agent to represent environment states with similar characteristics in a way that lets the agent recognise them as similar. We propose a novel algorithm that combines contrastive learning with reinforcement learning so that agents learn to group states by common physical characteristics and action preferences during training. We subsequently generalise these learnings to previously encountered environment obstacles. To enable a reinforcement learning agent to use contrastive learning within its environment interaction loop, we propose a state representation learning model that employs contrastive learning to group states using observations coupled with the action the agent chose within its current state. Our approach uses a combination of two algorithms that we augment to demonstrate the effectiveness of combining contrastive learning with reinforcement learning. The state representation model for contrastive learning is the Simple Framework for Contrastive Learning of Visual Representations (SimCLR) of Chen et al. [2020], which we amend to include action values from the chosen reinforcement learning environment. The policy gradient algorithm Proximal Policy Optimization (PPO) is our chosen reinforcement learning approach for policy learning, which we combine with SimCLR to form our novel algorithm, Action Contrastive Policy Optimization (ACPO). When combining these augmented algorithms for contrastive reinforcement learning, our results show significant improvement in training performance and generalisation to unseen environment obstacles of similar structure (physical layout of interactive objects) and mechanics (the rules of physics and transition probabilities).

Item
Regime Based Portfolio Optimization: A Look at the South African Asset Market
(University of the Witwatersrand, Johannesburg, 2023-09) Mdluli, Nkosenhle S.; Ajoodha, Ritesh; Mulaudzi, Rudzani
Financial markets change their properties (i.e., mean, volatility, correlation, and distribution) with time. However, traditional portfolio optimization strategies seek to create static, all-weather portfolios that ignore these changes and current economic conditions. This produces portfolios that are unable to predict events with excessive skewness and kurtosis. This research investigated the difference in percentage return between portfolios that incorporate regimes and one that does not. Hidden Markov models (HMMs), binary segmentation, and the Pruned Exact Linear Time (PELT) algorithm were used to identify regimes in 7 macro-economic features. These regimes, together with regimes identified by the South African Reserve Bank (SARB), were incorporated into Markowitz's mean-variance optimization technique to optimize portfolios. The base portfolio, which did not incorporate regimes, produced the lowest return, 761%, during the period under consideration. Portfolios using HMM-identified regimes produced the highest returns, averaging 3211%, whilst the portfolio using SARB-identified regimes returned 1878% during the same period. This research, therefore, shows that incorporating regimes into portfolio optimization increases the percentage return of a portfolio. Moreover, it shows that, although HMMs, on average, produced the most profitable portfolio, portfolios using regimes based on data-driven techniques do not always out-perform portfolios using the SARB-identified regimes.
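
As a rough illustration of how regime labels might feed into mean-variance optimization, the sketch below estimates a separate mean vector and covariance matrix for each regime and solves for one weight vector per regime. It is not the thesis's implementation. The abstract does not state the exact objective, constraints, or the way regimes enter the optimizer, so several things here are assumptions: the closed-form solution with a single fully-invested constraint (short positions allowed), the `risk_aversion` parameter, the function names `mean_variance_weights` and `regime_weights`, and the synthetic two-asset data standing in for real returns and for regime labels produced by an HMM or a change-point method.

```python
import numpy as np


def mean_variance_weights(mu, cov, risk_aversion=5.0):
    """Maximise mu'w - (risk_aversion / 2) * w'cov w subject to sum(w) == 1.

    Closed-form Lagrangian solution; short positions are allowed (assumption).
    """
    ones = np.ones_like(mu)
    inv_mu = np.linalg.solve(cov, mu)      # cov^{-1} mu
    inv_ones = np.linalg.solve(cov, ones)  # cov^{-1} 1
    # Multiplier that enforces the fully-invested constraint.
    lam = (ones @ inv_mu - risk_aversion) / (ones @ inv_ones)
    return (inv_mu - lam * inv_ones) / risk_aversion


def regime_weights(returns, regimes, risk_aversion=5.0):
    """Return one weight vector per regime label.

    Each regime's mean and covariance are estimated only from the
    periods assigned to that regime.
    """
    weights = {}
    for label in np.unique(regimes):
        block = returns[regimes == label]
        mu = block.mean(axis=0)
        cov = np.cov(block, rowvar=False)
        weights[int(label)] = mean_variance_weights(mu, cov, risk_aversion)
    return weights


# Synthetic example: two assets, a calm regime (0) and a stressed regime (1).
rng = np.random.default_rng(0)
calm = rng.normal([0.010, 0.004], [0.02, 0.01], size=(120, 2))
stressed = rng.normal([-0.020, 0.003], [0.08, 0.01], size=(60, 2))
returns = np.vstack([calm, stressed])
regimes = np.array([0] * 120 + [1] * 60)

weights_by_regime = regime_weights(returns, regimes)
for label, w in weights_by_regime.items():
    print(label, np.round(w, 3))
```

A static baseline in the same sketch would call `mean_variance_weights` once on the mean and covariance of the full `returns` array. Comparing that single allocation with the per-regime allocations is the kind of contrast the abstract describes.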