Mulaudzi, Rudzani
2021-12-08
2021-12-08
2021
Mulaudzi, Rudzani (2021) A machine learning approach to quantifying and relating the determinants of unemployment in South Africa, University of the Witwatersrand, Johannesburg, <http://hdl.handle.net/10539/32245>
https://hdl.handle.net/10539/32245
A research report submitted in fulfilment of the requirements for the degree Master of Science in Computer Science to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2021
Unemployment is a significant problem that South Africa faces. The rate was 30.1% in the first quarter of 2020, placing it amongst the ten worst unemployment rates in the world. Public policy is a typical instrument used by governments to address unemployment sustainably. It is normally informed by forecasts derived from economic (traditional statistical) models. These models are, however, suitable only when the data is stationary and linear. The South African unemployment rate, on the other hand, is asymmetric, seasonal, upward trending, and nonstationary. Vector autoregression (VAR), a traditional statistical model, was used to forecast the South African unemployment rate. It resulted in a mean absolute scaled error (MASE) of 41. Comparatively, twelve machine learning models were used to forecast the unemployment rate. The lowest MASE achieved was 0.39, making the best machine learning model roughly 105 times more accurate than the VAR benchmark. Additionally, through feature selection techniques, machine learning approaches enabled the identification of the most impactful features in forecasting the unemployment rate. These features were used to construct a Dynamic Bayesian Network (DBN) to determine how they influence each other and the unemployment rate. The DBN was then used to perform do-Calculus, a data-driven scenario analysis technique. One scenario tested the impact of increasing the GDP on the unemployment rate.
An increase in GDP positively impacts the unemployment rate. However, a decline in GDP has a greater negative impact. Therefore, policymakers should avoid, at all costs, a decline in the GDP. This research, therefore, demonstrates the value of machine learning in forecasting the South African unemployment rate (a nonstationary macroeconomic variable) across the broad machine learning value chain: feature selection, forecasting, feature influence analysis, and do-Calculus scenario analysis. Previous research tends to focus on only one or two aspects of the value chain.
Online resource (121 leaves)
en
Monetary policy-South Africa
Econometrics
A machine learning approach to quantifying and relating the determinants of unemployment in South Africa
Thesis
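
The MASE figures quoted in the abstract can be read against the standard definition of the metric: the forecast's mean absolute error divided by the in-sample mean absolute error of a naive forecast. The following is a minimal Python sketch of that definition, not the thesis's own code. The function name `mase`, its parameters, and the `season` lag are illustrative assumptions; the abstract does not say whether a seasonal or a one-step naive baseline was used. The last line only reproduces the ratio of the two MASE values stated in the abstract.

```python
def mase(actual, forecast, training, season=1):
    """Mean absolute scaled error (standard definition, assumed here).

    actual, forecast: out-of-sample values and their predictions.
    training: in-sample series used to scale the error.
    season: lag of the naive baseline (1 = previous observation);
            the lag used in the thesis is not stated in the abstract.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(training) <= season:
        raise ValueError("training series must be longer than the season lag")

    # Mean absolute error of the forecast being evaluated.
    forecast_mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

    # In-sample mean absolute error of the naive forecast y[t] = y[t - season].
    naive_errors = [
        abs(training[i] - training[i - season])
        for i in range(season, len(training))
    ]
    naive_mae = sum(naive_errors) / len(naive_errors)
    if naive_mae == 0:
        raise ValueError("naive baseline has zero error; MASE is undefined")

    return forecast_mae / naive_mae


# Ratio of the two MASE values reported in the abstract (VAR vs. best ML model).
improvement = 41 / 0.39
```

A MASE below 1 means the forecast beats the naive baseline on average, and a value above 1 means it does worse, which is why 0.39 against 41 is such a large gap.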