Optimising Visual Clarity using Clustering Techniques for Overcrowded Biplots

dc.contributor.author: Balisa, Yamkela
dc.contributor.supervisor: Ganey, Raeesa
dc.date.accessioned: 2025-11-10T16:04:42Z
dc.date.issued: 2025-06
dc.description: A research report submitted in partial fulfilment of the requirements for the degree of Master of Science in Mathematical Statistics, to the Faculty of Science, University of the Witwatersrand, Johannesburg, 2025
dc.description.abstract: The increasing use of data in various industries has driven the need for effective data analysis and visualisation. Data visualisation is a key methodology for extracting insights from data. One powerful visualisation technique based on dimensionality reduction methods is the biplot. Biplots are multivariate scatterplots that facilitate the visualisation of high-dimensional data by projecting it onto lower-dimensional spaces, usually two or three dimensions. This reduction in dimensionality is achieved using techniques such as Principal Component Analysis (PCA) for continuous data. A biplot simultaneously represents both samples and variables within the same visualisation. However, biplots often face challenges when the data contain a very large number of variables. A key issue is the overcrowding of variables within the biplot, which makes it difficult to obtain meaningful insights. To address this issue, this study explores the integration of unsupervised learning techniques, specifically clustering, into the biplot framework. Unsupervised learning refers to a type of machine learning approach in which the algorithm learns patterns and relationships in the data without prior knowledge of the expected output. Clustering, a fundamental unsupervised learning technique, involves grouping similar data points into clusters, enabling the identification of underlying structures and relationships. By applying clustering, specifically the k-means clustering algorithm, this study aims to group similar variables into distinct clusters within the biplot. Similar variables are determined by the proximity of their endpoints and the angles they form within the biplot. Ultimately, the refined biplot displays only a representative cluster of vectors, thus enhancing clarity and interpretability.
dc.description.submitter: MMM2025
dc.faculty: Faculty of Science
dc.identifier: 0000-0001-8253-0096
dc.identifier.citation: Balisa, Yamkela. (2025). Optimising Visual Clarity using Clustering Techniques for Overcrowded Biplots. [Master's dissertation, University of the Witwatersrand, Johannesburg]. WIReDSpace. https://hdl.handle.net/10539/47477
dc.identifier.uri: https://hdl.handle.net/10539/47477
dc.language.iso: en
dc.publisher: University of the Witwatersrand, Johannesburg
dc.rights: ©2025 University of the Witwatersrand, Johannesburg. All rights reserved. The copyright in this work vests in the University of the Witwatersrand, Johannesburg. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of University of the Witwatersrand, Johannesburg.
dc.rights.holder: University of the Witwatersrand, Johannesburg
dc.school: School of Statistics and Actuarial Science
dc.subject: High-dimensional data
dc.subject: Biplots
dc.subject: Principal component analysis
dc.subject: Clustering
dc.subject: K-means algorithm
dc.subject: UCTD
dc.subject.primarysdg: SDG-9: Industry, innovation and infrastructure
dc.subject.secondarysdg: SDG-4: Quality education
dc.title: Optimising Visual Clarity using Clustering Techniques for Overcrowded Biplots
dc.type: Dissertation
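
The approach summarised in the abstract (PCA biplot coordinates, k-means on the variable vectors, then a reduced set of vectors) might be sketched roughly as follows. This is an illustration only, not the dissertation's implementation: the function names `pca_biplot_coords`, `kmeans` and `representative_variables` are hypothetical, the k-means is a plain Lloyd-style loop written with NumPy, clustering is done on the vector endpoints alone (the abstract also mentions angles, and how the two are combined is not stated there), and keeping the variable nearest each cluster centre is an assumed selection rule.

```python
import numpy as np


def pca_biplot_coords(X, n_components=2):
    """Return sample scores and variable loadings for a PCA biplot."""
    Xc = X - X.mean(axis=0)  # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample points
    loadings = Vt[:n_components].T  # endpoints of the variable vectors
    return scores, loadings


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd-style k-means; returns (labels, centres)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct points chosen at random.
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # An empty cluster keeps its previous centre.
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    # Final assignment against the centres that are returned.
    dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return dist.argmin(axis=1), centres


def representative_variables(loadings, labels, centres):
    """Assumed rule: keep the variable whose endpoint is nearest each centre."""
    reps = []
    for j, centre in enumerate(centres):
        members = np.flatnonzero(labels == j)
        if members.size == 0:  # skip clusters that ended up empty
            continue
        dist = np.linalg.norm(loadings[members] - centre, axis=1)
        reps.append(int(members[dist.argmin()]))
    return reps


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.normal(size=(100, 30))  # synthetic data: 100 samples, 30 variables
    scores, loadings = pca_biplot_coords(X)
    labels, centres = kmeans(loadings, k=5)
    keep = representative_variables(loadings, labels, centres)
    print("variables kept for the reduced biplot:", sorted(keep))
```

A reduced biplot would then draw all rows of `scores` but only the rows of `loadings` indexed by `keep`. The number of clusters `k` is a free parameter in this sketch.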

Files

Balisa_Optimising_2025.pdf (3.66 MB, Adobe Portable Document Format)