Evaluation of cluster analysis and latent class analysis in clustering

dc.contributor.author: Murisa, Tatenda
dc.date.accessioned: 2020-09-08T12:16:03Z
dc.date.available: 2020-09-08T12:16:03Z
dc.date.issued: 2019
dc.description: A research report submitted in partial fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, School of Statistics and Actuarial Science, University of the Witwatersrand, Johannesburg, 2019 [en_ZA]
dc.description.abstract: The study compares the performance of latent class, K-means and hierarchical clustering on data with different degrees of cluster overlap. It also assesses how various standardisation methods affect the results of hierarchical and K-means clustering. Several distance and agglomeration methods are evaluated to observe how their performance depends on cluster overlap. Three artificial datasets were simulated, with clusters that were poorly, moderately and well separated. These, along with the seeds data, were run through the three clustering methods. Several external validity indices were calculated for each cluster solution. The adjusted Rand index was used for comparison in the discussion because it is not affected by the number of clusters. Results showed that, in hierarchical clustering, Ward's method outperformed all other agglomeration methods and the Manhattan distance performed better across the different cluster types. Latent class clustering performed better for poorly and well separated clusters. When the variances of the variables were comparable, K-means clustering with no standardisation performed well. Standardisation by the maximum value and by z-score gave the best cluster recovery when the variances of the variables were large. [en_ZA]
dc.description.librarian: TL (2020) [en_ZA]
dc.faculty: Faculty of Science [en_ZA]
dc.format.extent: Online resource (x, 134 pages)
dc.identifier.citation: Murisa, Tatenda Kenneth. (2019). Evaluation of cluster analysis and latent class analysis in clustering. University of the Witwatersrand, https://hdl.handle.net/10539/29564
dc.identifier.uri: https://hdl.handle.net/10539/29564
dc.language.iso: en [en_ZA]
dc.school: School of Statistics and Actuarial Science [en_ZA]
dc.subject.lcsh: Sampling (Statistics)
dc.subject.lcsh: Estimation theory
dc.subject.lcsh: Error analysis (Mathematics)
dc.title: Evaluation of cluster analysis and latent class analysis in clustering [en_ZA]
dc.type: Thesis [en_ZA]
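
Note on the adjusted Rand index: the abstract names this index as the basis for comparing cluster solutions against the known cluster labels. As an illustration only, and not code taken from the research report, the following minimal Python sketch computes the index from pair counts in the contingency table of two labelings. The function name adjusted_rand_index and the toy labels are assumptions introduced here.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same objects.

    Hypothetical helper for illustration; not from the research report.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivial agreement

    # Pair counts within contingency cells, true clusters and predicted clusters.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    # Agreement expected by chance and the maximum attainable agreement.
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (assumed): the same partition under different label names.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (assumed): a partition that cuts across the true clusters.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

In this sketch, same_partition is 1.0 because the index ignores label names, and crossed_partition is about -0.5, which is below the chance level of 0. The correction for chance is what allows solutions with different numbers of clusters to be compared on one scale.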

Files

Original bundle

Name: TKMurisa #879042 Research_Report.pdf
Size: 2.07 MB
Format: Adobe Portable Document Format
Description: Main Work
