Evaluation of cluster analysis and latent class analysis in clustering
dc.contributor.author | Murisa, Tatenda | |
dc.date.accessioned | 2020-09-08T12:16:03Z | |
dc.date.available | 2020-09-08T12:16:03Z | |
dc.date.issued | 2019 | |
dc.description | A research report submitted in partial fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, School of Statistics and Actuarial Science, University of the Witwatersrand, Johannesburg, 2019 | en_ZA |
dc.description.abstract | The study compares the performance of latent class, K-means and hierarchical clustering on data with different degrees of cluster overlap. It also assesses how various standardisation methods affect the results of hierarchical and K-means clustering. Several distance measures and agglomeration methods are evaluated to observe how their performance depends on cluster overlap. Three artificial datasets were simulated, with poorly, moderately and well separated clusters respectively. These, along with the seeds data, were run through the three clustering methods. Several external validity indices were calculated for each cluster solution. The adjusted Rand index was used for comparison in the discussion because it is not affected by the number of clusters. In hierarchical clustering, Ward’s method outperformed all other agglomeration methods, and the Manhattan distance performed better across the different cluster types. Latent class clustering performed better for poorly and well separated clusters. When the variances of the variables were comparable, K-means clustering with no standardisation performed well. Standardisation by the maximum value and by z-score gave the best cluster recovery when the variances of the variables were large. | en_ZA |
dc.description.librarian | TL (2020) | en_ZA |
dc.faculty | Faculty of Science | en_ZA |
dc.format.extent | Online resource (x, 134 pages) | |
dc.identifier.citation | Murisa, Tatenda Kenneth. (2019). Evaluation of cluster analysis and latent class analysis in clustering. University of the Witwatersrand, https://hdl.handle.net/10539/29564 | |
dc.identifier.uri | https://hdl.handle.net/10539/29564 | |
dc.language.iso | en | en_ZA |
dc.school | School of Statistics and Actuarial Science | en_ZA |
dc.subject.lcsh | Sampling (Statistics) | |
dc.subject.lcsh | Estimation theory | |
dc.subject.lcsh | Error analysis (Mathematics) | |
dc.title | Evaluation of cluster analysis and latent class analysis in clustering | en_ZA |
dc.type | Thesis | en_ZA |
Files
Original bundle
- Name: TKMurisa #879042 Research_Report.pdf
- Size: 2.07 MB
- Format: Adobe Portable Document Format
- Description: Main Work
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission