Statistical approaches for classifying & defining areas in South Africa as "urban" or "rural"

Date
2007-10-10T07:37:16Z
Authors
Laldaparsad, Sharthi
Abstract
The purpose of this research report is to use appropriate statistical techniques, both non-spatial and spatial, to classify areas in the country as urban or rural. The areas derived by each statistical method are profiled, and their common characteristics are summarised to support the classification and definition of urban and rural areas. Population data for these areas were aggregated to determine the overall level of urbanisation for the country. The methodology used was supervised classification. Two sample data sets of areas known with certainty to be urban or rural were derived and used consistently throughout the study. Areas of known urban and rural status were used first to identify the essential patterns or predominant characteristics of known areas, and then to apply those characteristics to areas whose status is unknown or ambiguous, in order to classify them as either urban or rural. Sample 1 comprises all areas in the country with formal and informal urban settlements, as well as formal rural areas, i.e. farms. Sample 2 is similar to sample 1 but additionally includes areas falling under the jurisdiction of traditional authorities, known as tribal areas, which were classed as known rural. Non-spatial techniques, namely linear logistic regression, classification trees and discriminant analysis, as well as spatial techniques, namely straight-majority-rule and iterated conditional modes (ICM), were researched, applied and analysed for both samples, for each province and for South Africa as a whole, using the 2001 South African population census data. Comparisons were made with the 1996 census information. All three non-spatial statistical methods gave insight into the census variables, and combinations of variables, that best describe the subject under research, i.e. urban and rural. All three methods identified significant variables that clearly separate urban and rural areas.
The results of all three non-spatial statistical methods showed similarities within each sample, but differences were noted between the two samples. All three non-spatial statistical methods applied to sample 1 classified the majority of the tribal enumeration areas (EAs) as urban, whilst the results from sample 2 are very similar to those obtained from both censuses, since both censuses and sample 2 predefine tribal settlements as rural.
Keywords
urban, rural