Electronic Theses and Dissertations (Masters)

Permanent URI for this collection: https://hdl.handle.net/10539/38006

Now showing 1 - 9 of 9
  • Procedural Content Generation for video game levels with human advice
    (University of the Witwatersrand, Johannesburg, 2023-07) Raal, Nicholas Oliver; James, Steven
    Video gaming is an extremely popular form of entertainment around the world, and new video game releases are constantly being showcased. One issue with the video gaming industry is that game developers require a large amount of time to develop new content. A research field that can help with this is procedural content generation (PCG), which allows for an infinite number of video game levels to be generated based on the parameters provided. Many of the methods found in the literature can reliably generate content that adheres to quantifiable characteristics such as playability, solvability and difficulty. These methods do not, however, take into account the aesthetics of the level, which is what makes levels more reasonable for human players. In order to address this issue, we propose a method of incorporating high-level human advice into the PCG loop. The method uses pairwise comparisons to assign a score to a level based on its aesthetics. Using the score along with a feature vector describing each level, a support vector regression (SVR) model is trained so that a score can be assigned to unseen video game levels. This predicted score is used as an additional fitness function of a multi-objective genetic algorithm (GA) and can be optimised as a standard fitness function would be. We test the proposed method on two 2D platformer video games, Maze and Super Mario Bros (SMB), and our results show that the proposed method can successfully be used to generate levels with a bias towards the human-preferred aesthetic features, whilst still adhering to standard video game characteristics such as solvability. We further investigate incorporating multiple inputs from a human at different stages of the PCG life cycle and find that it does improve the proposed method, but further testing is still required.
The findings of this research will hopefully assist in using PCG in the video game space to create levels that are more aesthetically pleasing to a human player.
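As a rough illustration of how pairwise comparisons can be turned into a per-level score, the sketch below uses a simple win fraction. The abstract does not state how the comparisons are aggregated, so the scoring rule, the function name `win_fraction_scores` and the level identifiers are all assumptions made for this example.

```python
from collections import defaultdict


def win_fraction_scores(comparisons):
    """Score each level by the fraction of its pairwise comparisons it won.

    `comparisons` is a list of (winner, loser) pairs of level identifiers.
    This aggregation rule is an assumption, not the thesis's method.
    """
    wins = defaultdict(int)
    seen = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    return {level: wins[level] / seen[level] for level in seen}


# Hypothetical judgements: the human preferred the first level of each pair.
comparisons = [("A", "B"), ("A", "C"), ("B", "C")]
scores = win_fraction_scores(comparisons)
print(scores)
```

Scores of this kind could then serve as regression targets for a model that maps a level's feature vector to a predicted aesthetic score.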
  • A Continuous Reinforcement Learning Approach to Self-Adaptive Particle Swarm Optimisation
    (University of the Witwatersrand, Johannesburg, 2023-08) Tilley, Duncan; Cleghorn, Christopher
    Particle Swarm Optimisation (PSO) is a popular black-box optimisation technique due to its simple implementation and surprising ability to perform well on various problems. Unfortunately, PSO is fairly sensitive to the choice of hyper-parameters. For this reason, many self-adaptive techniques have been proposed that attempt to both simplify hyper-parameter selection and improve the performance of PSO. Surveys, however, show that many self-adaptive techniques are still outperformed by time-varying techniques where the values of the coefficients are simply increased or decreased over time. More recent works have shown the successful application of Reinforcement Learning (RL) to learn self-adaptive control policies for optimisers such as differential evolution, genetic algorithms, and PSO. However, many of these applications were limited to discrete state and action spaces, which severely limits the choices available to a control policy, given that the PSO coefficients are continuous variables. This dissertation therefore investigates the application of continuous RL techniques to learn a self-adaptive control policy that can make full use of the continuous nature of the PSO coefficients. The dissertation first introduces the RL framework used to learn a continuous control policy by defining the environment, action space, state space, and a number of possible reward functions. An effective learning environment that is able to overcome the difficulties of continuous RL is then derived through a series of experiments, culminating in a successfully learned continuous control policy. The policy is then shown to perform well on the benchmark problems used during training when compared to other self-adaptive PSO algorithms. Further testing on benchmark problems not seen during training suggests that the learned policy may not, however, generalise well to other functions, but this is shown to also be a problem in other PSO algorithms.
Finally, the dissertation performs a number of experiments to provide insights into the behaviours learned by the continuous control policy.
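For readers unfamiliar with the coefficients being adapted, the sketch below shows the textbook inertia-weight PSO update with fixed values of the inertia weight `w` and the acceleration coefficients `c1` and `c2`. It is not the dissertation's code: the function names, the sphere objective and the coefficient values are illustrative assumptions. A self-adaptive controller would choose `w`, `c1` and `c2` at each step instead of holding them constant.

```python
import random


def pso_step(positions, velocities, personal_best, global_best, w, c1, c2, rng):
    """Apply one inertia-weight PSO velocity and position update in place."""
    for i in range(len(positions)):
        for d in range(len(positions[i])):
            r1, r2 = rng.random(), rng.random()
            velocities[i][d] = (
                w * velocities[i][d]
                + c1 * r1 * (personal_best[i][d] - positions[i][d])
                + c2 * r2 * (global_best[d] - positions[i][d])
            )
            positions[i][d] += velocities[i][d]


def sphere(x):
    """Simple benchmark objective with its minimum at the origin."""
    return sum(v * v for v in x)


def minimise(objective, dim=2, swarm_size=10, iterations=50,
             w=0.7, c1=1.4, c2=1.4, seed=0):
    """Run PSO with fixed coefficients (arbitrary illustrative values)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm_size)]
    velocities = [[0.0] * dim for _ in range(swarm_size)]
    personal_best = [p[:] for p in positions]
    global_best = min(personal_best, key=objective)[:]
    for _ in range(iterations):
        pso_step(positions, velocities, personal_best, global_best, w, c1, c2, rng)
        # Keep each particle's best position seen so far.
        for i, p in enumerate(positions):
            if objective(p) < objective(personal_best[i]):
                personal_best[i] = p[:]
        global_best = min(personal_best, key=objective)[:]
    return global_best


best = minimise(sphere)
print(best, sphere(best))
```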
  • Self Supervised Salient Object Detection using Pseudo-labels
    (University of the Witwatersrand, Johannesburg, 2023-08) Bachan, Kidhar; Wang, Hairong
    Deep Convolutional Neural Networks have dominated salient object detection methods in recent history. A determining factor for salient object detection network performance is the quality and quantity of pixel-wise annotated labels. This annotation is performed manually, making it expensive (time-consuming, tedious), while limiting the training data to the available annotated datasets. Alternatively, unsupervised models are able to learn from unlabelled datasets or datasets in the wild. In this work, an existing algorithm [Li et al. 2020] is used to refine the generated pseudo-labels before training. This research focuses on the changes made to the pseudo-label refinement algorithm and their effect on performance for unsupervised salient object detection tasks. We show that using this novel approach leads to statistically negligible performance improvements and discuss the reasons why this is the case.
  • Evaluating Pre-training Mechanisms in Deep Learning Enabled Tuberculosis Diagnosis
    (University of the Witwatersrand, Johannesburg, 2024) Zaranyika, Zororo; Klein, Richard
    Tuberculosis (TB) is an infectious disease caused by a bacterium called Mycobacterium tuberculosis. In 2021, 10.6 million people fell ill because of TB, and about 1.5 million lives are lost to TB each year even though TB is a preventable and curable disease. The latest global trends in TB death cases are shown in 1.1. To ensure a higher survival rate and prevent further transmission, it is important to carry out early diagnosis. One of the critical methods of TB diagnosis and detection is the use of posterior-anterior chest radiographs (CXR). The diagnosis of tuberculosis and other chest-affecting diseases like pneumoconiosis is time-consuming and challenging, and requires experts to read and interpret chest X-ray images, especially in under-resourced areas. Various attempts have been made to perform the diagnosis using deep learning methods such as Convolutional Neural Networks (CNN) trained on labelled CXR images. Because CXR images maintain a consistent structure and share overlapping visual appearances across different chest-affecting diseases, it is reasonable to believe that visual features learned in one disease or geographic location may transfer to a new TB classification model. This would allow us to leverage large volumes of labelled CXR images available online, hence decreasing the data required to build a local model. This work will explore to what extent such pre-training and transfer learning is useful and whether it may help decrease the data required for a locally trained classifier. In this research, we investigated various pre-training regimes using selected online datasets to understand whether the performance of such models can be generalised towards building a TB computer-aided diagnosis system, and to inform us on the nature and size of the CXR datasets we should be collecting.
Our experimental results indicated that neither supervised nor self-supervised pre-training between the CXR datasets can significantly improve the overall performance metrics of a TB classifier. We noted that pre-training on the ChestX-ray14, CheXpert, and MIMIC-CXR datasets resulted in recall values of over 70% and specificity scores of at least 90%. There was a general decline in performance in our experiments when we pre-trained on one dataset and fine-tuned on a different dataset, hence our results were lower than the baseline experiment results. We noted that ImageNet weight initialisation yields superior results over random weight initialisation on all experiment configurations. In the case of self-supervised pre-training, the model reached acceptable metrics with as few as 5% of the labels when we fine-tuned on the TBX11k dataset, although with slightly lower performance than the supervised pre-trained models and the baseline results. The best-performing self-supervised pre-trained model with the least number of training labels was the MoCo-ResNet-50 model pre-trained on the VinDr-CXR and PadChest datasets. These model configurations achieved a recall score of 81.90% and a specificity score of 81.99% on VinDr-CXR pre-trained weights, while the PadChest weights scored a recall of 70.29% and a specificity of 70.22%. The other self-supervised pre-trained models failed to reach scores of at least 50% on either recall or specificity with the same number of labels.
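Recall and specificity, the two metrics quoted above, follow directly from the confusion counts of a binary classifier. The sketch below assumes labels are encoded as 1 for TB-positive and 0 for TB-negative; the function name and the toy labels are made up for illustration and are not taken from the thesis.

```python
def recall_and_specificity(y_true, y_pred):
    """Return (recall, specificity) for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    recall = tp / (tp + fn)        # share of true positives that were detected
    specificity = tn / (tn + fp)   # share of true negatives correctly cleared
    return recall, specificity


# Toy example: 3 positive cases and 4 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
print(recall_and_specificity(y_true, y_pred))
```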
  • Regime Based Portfolio Optimization: A Look at the South African Asset Market
    (University of the Witwatersrand, Johannesburg, 2023-09) Mdluli, Nkosenhle S.; Ajoodha, Ritesh; Mulaudzi, Rudzani
    Financial markets change their properties (i.e. mean, volatility, correlation, and distribution) with time. However, traditional portfolio optimization strategies seek to create static, all-weather portfolios that are oblivious to this and to current economic conditions. This produces portfolios that are unable to predict events with excessive skewness and kurtosis. This research investigated the difference in percentage return between portfolios that incorporate regimes and one that does not. Hidden Markov models (HMMs), binary segmentation, and PELT algorithms were used to identify regimes in 7 macro-economic features. These regimes, together with regimes identified by the South African Reserve Bank (SARB), were incorporated into Markowitz's mean-variance optimization technique to optimize portfolios. The base portfolio, which did not incorporate regimes, produced the lowest return, 761%, during the period under consideration. Portfolios using HMM-identified regimes produced the highest returns, averaging 3211%, whilst the portfolio using SARB-identified regimes returned 1878% during the same period. This research therefore shows that incorporating regimes into portfolio optimization increases the percentage return of a portfolio. Moreover, it shows that, although HMMs, on average, produced the most profitable portfolios, portfolios using regimes based on data-driven techniques do not always outperform portfolios using the SARB-identified regimes.
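For reference, the standard Markowitz mean-variance problem can be written as below, where $w$ is the vector of portfolio weights, $\mu$ the vector of expected asset returns, $\Sigma$ the covariance matrix of returns and $r^{*}$ a target return. The abstract does not say how regimes enter the optimization; one possibility, assumed here purely for illustration, is to estimate $\mu$ and $\Sigma$ separately from the observations that fall in each regime.

```latex
\begin{aligned}
\min_{w}\quad & w^{\top} \Sigma\, w \\
\text{subject to}\quad & w^{\top} \mu = r^{*}, \\
& \mathbf{1}^{\top} w = 1
\end{aligned}
```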
  • Generating Rich Image Descriptions from Localized Attention
    (University of the Witwatersrand, Johannesburg, 2023-08) Poulton, David; Klein, Richard
    The field of image captioning is constantly growing with swathes of new methodologies, performance leaps, datasets, and challenges. One new challenge is the task of long-text image description. While the vast majority of research has focused on short captions for images with only short phrases or sentences, new research and the recently released Localized Narratives dataset have pushed this to rich, paragraph-length descriptions. In this work, we perform additional research to grow the sub-field of long-text image descriptions and determine the viability of our new methods. We experiment with a variety of progressively more complex LSTM and Transformer-based approaches, utilising human-generated localised attention traces and image data to generate suitable captions, and evaluate these methods on a suite of common language evaluation metrics. We find that LSTM-based approaches are not well suited to the task, and underperform Transformer-based implementations on our metric suite while also proving substantially more demanding to train. On the other hand, we find that our Transformer-based methods are capable of generating grammatically sound captions with rich focus over all regions of the image, with our most complex model outperforming existing approaches on our metric suite.
  • Pipeline for the 3D Reconstruction of Rigid, Handheld Objects through the Use of Static Cameras
    (University of the Witwatersrand, Johannesburg, 2023-04) Kambadkone, Saatwik Ramakrishna; Klein, Richard
    In this paper, we develop a pipeline for the 3D reconstruction of handheld objects using a single, static RGB-D camera. We also create a general pipeline to describe the process of handheld object reconstruction. This general pipeline suggests the deconstruction of this task into three main constituents: input, where we decide our main method of data capture; segmentation and tracking, where we identify and track the relevant parts of our captured data; and reconstruction, where we develop a method for reconstructing our previous information into 3D models. We successfully create a handheld object reconstruction method using a depth sensor as our input; hand tracking, depth segmentation and optical flow to retrieve relevant information; and reconstruction through the use of ICP and TSDF maps. During this process, we also evaluate other possible variations of this successful method. In one of these variations, we test the effect of using depth estimation to generate data as the input to our pipeline. While this experimentation helps us quantify our method’s robustness to noise in the input data, we conclude that current depth estimation techniques do not provide adequate detail for the reconstruction of handheld objects.
  • Learning to adapt: domain adaptation with cycle-consistent generative adversarial networks
    (University of the Witwatersrand, Johannesburg, 2023) Burke, Pierce William; Klein, Richard
    Domain adaptation is a critical part of modern-day machine learning, as many practitioners do not have the means to reliably collect and label all the data they require. Instead, they often turn to large online datasets to meet their data needs. However, this can often lead to a mismatch between the online dataset and the data they will encounter in their own problem. This is known as domain shift and plagues many different avenues of machine learning. It can arise from differences in data sources, changes in the underlying processes generating the data, or new unseen environments the models have yet to encounter. All these issues can lead to performance degradation. Building on the success of Cycle-consistent Generative Adversarial Networks (CycleGAN) in learning unpaired image-to-image mappings, we propose a new method to help alleviate the issues caused by domain shifts in images. The proposed model incorporates an adversarial loss to encourage realistic-looking images in the target domain, a cycle-consistency loss to learn an unpaired image-to-image mapping, and a semantic loss from a task network to improve the generator’s performance. The task network is trained concurrently with the generators on the generated images to improve downstream task performance on adapted images. By utilizing the power of CycleGAN, we can learn to classify images in the target domain without any target domain labels. In this research, we show that our model is successful on various unsupervised domain adaptation (UDA) datasets and can alleviate domain shifts for different adaptation tasks, like classification or semantic segmentation. In our experiments on standard classification, we were able to bring the model’s performance to near oracle-level accuracy on a variety of different classification datasets. The semantic segmentation experiments showed that our model could improve the performance on the target domain, but there is still room for further improvements.
We also further analyze where our model performs well and where improvements can be made.
  • Generative Model Based Adversarial Defenses for Deepfake Detectors
    (University of the Witwatersrand, Johannesburg, 2023-08) Kavilan Dhavan, Nair; Klein, Richard
    Deepfake videos present a serious threat to society as they can be used to spread misinformation through social media. Convolutional Neural Networks (CNNs) have been effective in detecting deepfake videos, but they are vulnerable to adversarial attacks that can compromise their accuracy. This vulnerability can be exploited by deepfake creators to evade detection. In this study, we evaluate the effectiveness of two generative adversarial defense mechanisms, APE-GAN and MagNet, in the context of deepfake detection. We use the FaceForensics++ dataset and a CNN victim model based on the XceptionNet architecture, which we attack using the iterative fast gradient sign method at two different levels of ε, ε = 0.0001 and ε = 0.01. We find that both APE-GAN and MagNet can purify the adversarial images and restore the performance of the victim model to within 10% of the model’s accuracy on benign fake inputs. However, these methods were less effective at restoring accuracy for adversarial real examples and were not able to significantly restore accuracy when the adversarial attack was aggressive (ε = 0.01). We recommend that an adversarial defense method be used in conjunction with a deepfake detector to improve the accuracy of predictions. APE-GAN and MagNet are effective methods in the deepfake context, but their effectiveness is limited when the adversarial attack is aggressive.
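To make the attack concrete, the sketch below applies the iterative fast gradient sign method to a toy logistic-regression classifier in place of the XceptionNet victim model. It is only an illustration under stated assumptions: the weights, input, step size and number of steps are invented for the example, and only the perturbation budget ε = 0.01 is taken from the abstract.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def iterative_fgsm(x, y, weights, bias, epsilon, step_size, num_steps):
    """Return an adversarial copy of x inside an L-infinity ball of radius epsilon.

    The victim is a toy logistic model p = sigmoid(w . x + b); for the
    cross-entropy loss, the gradient with respect to the input is (p - y) * w.
    """
    x_adv = list(x)
    for _ in range(num_steps):
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x_adv)) + bias)
        grad = [(p - y) * w for w in weights]
        # Step in the direction that increases the loss.
        x_adv = [xi + step_size * sign(g) for xi, g in zip(x_adv, grad)]
        # Project back into the epsilon-ball around x and the valid range [0, 1].
        x_adv = [
            min(max(xa, x0 - epsilon, 0.0), x0 + epsilon, 1.0)
            for xa, x0 in zip(x_adv, x)
        ]
    return x_adv


# Hypothetical two-feature input with true label 1.
x = [0.5, 0.5]
weights, bias = [2.0, -1.0], 0.0
x_adv = iterative_fgsm(x, 1, weights, bias, epsilon=0.01, step_size=0.0025, num_steps=10)
print(x_adv)
```

A larger ε allows a larger per-feature change, which corresponds to the more aggressive attack setting described in the abstract.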