Clustering and Classification Techniques in the Presence of Outliers: An Application to the Johannesburg Stock Exchange Stocks

Date

2024

Publisher

University of the Witwatersrand, Johannesburg

Abstract

This study explores the impact of outliers on clustering with the K-means algorithm and finds that a high prevalence of outliers can seriously compromise clustering results. It proposes a novel algorithm, clustering-quality-aided outlier detection (CQAOD). Its novelty is that, in addition to identifying outliers, it achieves good-quality clustering and simultaneously proffers the “optimal” number of clusters for K-means clustering of multivariate Gaussian data. For the Johannesburg Stock Exchange (JSE) data, the study compares the efficacy of four clustering techniques, namely hierarchical clustering, spectral clustering, Clustering Large Applications (CLARA), and density-based spatial clustering of applications with noise (DBSCAN), with the aim of constructing a diversified stock portfolio. The study found that hierarchical clustering is the best of the compared algorithms for clustering the shares on the JSE.
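
The sensitivity of K-means to outliers described in the abstract can be illustrated with a minimal sketch. It is not the CQAOD algorithm and does not use JSE data: it runs a plain Lloyd's-style K-means on synthetic two-dimensional Gaussian data, with and without a small group of distant points. The cluster centers, spreads, sample sizes, seed, and the helper names `kmeans` and `worst_center_error` are all assumptions chosen for illustration.

```python
import numpy as np


def kmeans(X, init_centroids, n_iter=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_samples, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels


def worst_center_error(centroids, true_centers):
    """Largest distance from a true center to its nearest fitted centroid."""
    d = np.linalg.norm(true_centers[:, None, :] - centroids[None, :, :], axis=2)
    return d.min(axis=1).max()


rng = np.random.default_rng(0)

# Hypothetical setup: two well-separated Gaussian clusters (assumed values).
true_centers = np.array([[0.0, 0.0], [5.0, 5.0]])
clean = np.vstack([rng.normal(c, 0.5, size=(100, 2)) for c in true_centers])

# Ten far-away points play the role of outliers (about 5% contamination).
outliers = rng.normal([50.0, 50.0], 0.5, size=(10, 2))
contaminated = np.vstack([clean, outliers])

# Both runs start from the true centers, so any distortion in the second run
# comes from the outliers rather than from a poor initialization.
clean_centroids, _ = kmeans(clean, true_centers)
dirty_centroids, _ = kmeans(contaminated, true_centers)

clean_error = worst_center_error(clean_centroids, true_centers)
dirty_error = worst_center_error(dirty_centroids, true_centers)

print("worst center error without outliers:", round(clean_error, 3))
print("worst center error with outliers:   ", round(dirty_error, 3))
```

In this setup the fitted centroids stay close to the true centers on the clean data, while the contaminated run pulls at least one centroid well away from them. The exact figures depend on the assumed parameters and should not be read as results from the dissertation.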

Description

A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in partial fulfillment of the requirements for the degree of Master of Science, 2024.

Keywords

Clustering, Classification, K-means, Multivariate, CQAOD, UCTD

Citation

Maphalla, Retsebile. (2024). Clustering and Classification Techniques in the Presence of Outliers: An Application to the Johannesburg Stock Exchange Stocks [Master’s dissertation, University of the Witwatersrand, Johannesburg]. WireDSpace.
