Electronic Theses and Dissertations (Masters)

Recent Submissions

  • Item
    Predicting Future Stock Price with Sentiment Analysis: Recurrent vs. Attention Based Learning for Regression Tasks
    (University of the Witwatersrand, Johannesburg, 2023-08) Mcdonald, Bernard; Nasejje, Justine
    Stock price prediction is a lucrative challenge because successful prediction could yield significant profits for investors, which attracts research utilising novel data sources and modelling techniques. This research aimed to accurately predict the future closing price of the top five stocks of the NASDAQ100 index by leveraging Twitter data and recent advancements in machine learning. Three representations of large-scale Twitter data were derived: company, stock market, and general public sentiment. Company sentiment and stock market sentiment were Granger-causal (p < 0.10) for the closing price of four and two of the five companies considered, respectively. Five stock price prediction models were built: ARIMA, RNN, LSTM, GRU, and a novel Transformer model. A hyperparameter grid search selected feature subsets containing sentiment data as optimal in sixteen of the twenty (80%) model-dataset combinations fitted. Assessed using the RMSE, all the machine learning models outperformed the ARIMA model. The attention-based Transformer model outperformed the recurrent models in both predictive performance and computational training efficiency. The Transformer model produced test RMSEs of 1.22, 2.07, 35.54, 16.61, and 4.95 when predicting the closing price of Apple, Microsoft, Amazon, Alphabet, and Facebook, respectively.
  • Item
    Clustering and Classification Techniques in the Presence of Outliers: An Application to the Johannesburg Stock Exchange Stocks
    (University of the Witwatersrand, Johannesburg, 2024) Maphalla, Retsebile; Chipoyera, HW
    In this study, the impact of outliers on clustering using the K-means algorithm was explored. It was observed that a high prevalence of outliers can seriously compromise the results of clustering. A novel algorithm called Clustering-quality-aided outlier detection (CQAOD) is proposed in this study. The novelty stems from the fact that, apart from identifying outliers, the algorithm achieves good quality clustering and simultaneously proffers the “optimal” number of clusters for K-means clustering of multivariate Gaussian data. For the Johannesburg Stock Exchange (JSE) data, the efficacy of four clustering techniques was compared with the aim of constructing a diversified stock portfolio: hierarchical clustering, spectral clustering, Clustering Large Applications (CLARA), and density-based spatial clustering of applications with noise (DBSCAN). The study found that hierarchical clustering is the best of these algorithms for clustering the shares on the JSE.
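
The second abstract's observation that outliers can compromise K-means results can be illustrated with a toy example. The sketch below is not the CQAOD algorithm and does not use the thesis data; it is a minimal one-dimensional implementation of the standard Lloyd iteration, with a hypothetical helper name (`kmeans_1d`), made-up points, and hand-picked starting centroids, all chosen only for illustration.

```python
def kmeans_1d(points, initial_centroids, max_iters=100):
    """Plain Lloyd-style K-means on 1-D data (illustrative helper, not CQAOD)."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged: assignments can no longer change
        centroids = updated
    return centroids, clusters


# Made-up data: two well-separated groups, then the same data plus one outlier.
clean = [1, 2, 3, 10, 11, 12]
with_outlier = clean + [100]

clean_centroids, clean_clusters = kmeans_1d(clean, [1, 12])
dirty_centroids, dirty_clusters = kmeans_1d(with_outlier, [1, 12])

print(clean_centroids, clean_clusters)
print(dirty_centroids, dirty_clusters)
```

With these inputs, the clean data splits into its two natural groups, whereas adding the single extreme value causes that value to claim a cluster of its own and the two genuine groups to merge. This is one simple way in which outliers can distort a K-means partition when the number of clusters is fixed.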