The effect of outliers in model-based clustering using the Expectation Maximization (EM) algorithm
Date
2020
Authors
Mpogeng, Reatile
Abstract
The value and use of data are becoming increasingly prominent in bettering society and businesses. In agriculture, enormous amounts of data are being collected through weather forecasting, remote sensing and geographic information systems. With the use of data mining techniques, this data can be used, for example, in the discovery of information in agriculture (Cebeci and Yildiz, 2015). Business intelligence and analytics are growing to become an imperative field of work for researchers, which extends the need for data-related solutions (Chen et al., 2012). Classification, clustering, regression, outlier detection and correlation are some of the data mining techniques that can be useful in understanding societal and corporate problems through data. Data mining tools are often used to create models that aim to represent a real phenomenon. The drawback of these tools is that most of them rest on statistical theory whose underlying assumptions are not necessarily practical. Of interest in this study are clustering algorithms.
Description
A research report submitted to the Faculty of Science, University of the Witwatersrand, in partial fulfilment of the requirements for the degree of Master of Science, 2020