Clustering and Classification Techniques in the Presence of Outliers: An Application to the Johannesburg Stock Exchange Stocks

Date
2024
Publisher
University of the Witwatersrand, Johannesburg
Abstract
This study explored the impact of outliers on clustering with the K-means algorithm and observed that a high prevalence of outliers can seriously compromise clustering results. A novel algorithm, clustering-quality-aided outlier detection (CQAOD), is proposed. Its novelty is that, apart from identifying outliers, it achieves good-quality clustering and simultaneously proffers the “optimal” number of clusters for K-means clustering of multivariate Gaussian data. For the Johannesburg Stock Exchange (JSE) data, the efficacy of four clustering techniques was compared with the aim of constructing a diversified stock portfolio: hierarchical clustering, spectral clustering, Clustering Large Applications (CLARA), and density-based spatial clustering of applications with noise (DBSCAN). The study found that hierarchical clustering was the best of these algorithms for clustering the shares on the JSE.
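The sensitivity of K-means to outliers noted in the abstract can be illustrated with a minimal sketch. This is not the CQAOD algorithm and does not use the JSE data: the points are made-up toy values, the helper names `farthest_first_init` and `kmeans` are hypothetical, and the farthest-first seeding is an assumption chosen only to keep the run deterministic.

```python
from math import dist


def farthest_first_init(points, k):
    """Deterministic seeding (an illustrative assumption): start at the first
    point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd-style K-means on a list of coordinate tuples."""
    centroids = farthest_first_init(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Toy data (made up): two tight groups and one extreme point.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
group_b = [(10, 10), (10, 11), (11, 10), (11, 11)]
outlier = [(100, 100)]

clean_labels, clean_centroids = kmeans(group_a + group_b, k=2)
noisy_labels, noisy_centroids = kmeans(group_a + group_b + outlier, k=2)

print("without outlier:", clean_labels, clean_centroids)
print("with outlier:   ", noisy_labels, noisy_centroids)
```

In this toy setting, the run without the extreme point separates the two groups, while the run with it spends one of the two clusters on the single outlier and merges both groups into the other.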
Description
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in partial fulfillment of the requirements for the degree of Master of Science, Johannesburg, 2024.
Keywords
Clustering, Classification, K-means, Multivariate, CQAOD, UCTD
Citation
Maphalla, Retsebile. (2024). Clustering and Classification Techniques in the Presence of Outliers: An Application to the Johannesburg Stock Exchange Stocks [Master’s dissertation, University of the Witwatersrand, Johannesburg]. WireDSpace.