UNIVERSITY OF THE WITWATERSRAND

M.SC. COMPUTER SCIENCE BY DISSERTATION

Applying Machine Learning to Model South Africa’s Equity Market Index Price Performance

Author: Tshepo Chris NOKERI
Supervisors: Dr. Ritesh AJOODHA and Mr. Rudzani MULAUDZI

A thesis submitted in fulfillment of the requirements for the degree of M.Sc. Computer Science by Dissertation in the School of Computer Science and Applied Mathematics

July 14, 2023

https://www.wits.ac.za
http://www.tshepochris.com
https://www.riteshajoodha.co.za/
https://www.linkedin.com/in/rudzanimulaudzi
https://www.wits.ac.za/csam/

Declaration of Authorship

I, Tshepo Chris NOKERI, declare that this thesis titled, “Applying Machine Learning to Model South Africa’s Equity Market Index Price Performance” and the work presented in it, are my own. I confirm that:

• This work was done wholly or mainly while in candidature for the M.Sc. Computer Science by Dissertation degree in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand.
• Where any part of this thesis has previously been submitted for a degree or any other qualification at the University of the Witwatersrand or any other institution, this has been clearly stated.
• Where I have consulted the published work of others, this is always clearly attributed.
• Where I have quoted from the work of others, the source is always given. Except for such quotations, this thesis is entirely my own work.
• I have acknowledged all main sources of help.
• Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

Signature:
Date: July 14, 2023

Abstract

Faculty of Science
School of Computer Science and Applied Mathematics

Policymakers typically use statistical multivariate forecasting models to forecast the reaction of stock market returns to changing economic activities. However, these models frequently perform poorly because they are inflexible and unable to model non-linear relationships. Emerging research suggests that machine learning models can better handle data from non-linear dynamic systems and yield outstanding model performance. This research compared the performance of machine learning models to the performance of the benchmark model (the vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa. The vector autoregressive model achieved a mean absolute percentage error (MAPE) value of 0.0084 when forecasting the reaction of stock market returns. The lowest MAPE value achieved by the machine learning models on the same task was 0.0051. The machine learning model trained on low economic data dimensions performed 65% better than the benchmark model. Machine learning models also identified key economic activities when forecasting the reaction of stock market returns. Most previous research used whole feature sets, compared only a few models, and paid little attention to how different feature subsets and reduced dimensionality change model performance, a limitation this research addresses through the number of experiments it considers. This research considered various experiments, i.e., different feature subsets and data dimensions, to determine whether machine learning models perform better than the benchmark model when forecasting the reaction of stock market returns to changing economic activities in South Africa.
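The two MAPE values quoted in the abstract can be compared in more than one way, so a short arithmetic sketch may help. The snippet below is illustrative only: the function names `mape` and `relative_comparison` are assumptions rather than part of the thesis, and reading "65% better" as the ratio of the benchmark error to the machine learning error is an interpretation that happens to match the reported figure, not a confirmed description of how it was computed.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (not multiplied by 100).

    Assumes no actual value is zero, since each error is divided by the actual value.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def relative_comparison(benchmark_mape, model_mape):
    """Return two common ways of comparing a model's error with a benchmark's error."""
    # How much larger the benchmark error is than the model error.
    excess = benchmark_mape / model_mape - 1.0
    # How much of the benchmark error the model removes.
    reduction = (benchmark_mape - model_mape) / benchmark_mape
    return excess, reduction


# MAPE values reported in the abstract.
excess, reduction = relative_comparison(0.0084, 0.0051)
print(f"benchmark error exceeds model error by {excess:.1%}")
print(f"model reduces benchmark error by {reduction:.1%}")
```

Under the first reading the benchmark error is roughly 65% larger than the best machine learning error; under the second, the error is reduced by roughly 39%. Both ratios are unaffected by whether MAPE is reported as a fraction or as a percentage.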
Contents

Declaration of Authorship
Abstract
1 Introduction to the Research
  1.1 Introduction to the Research
  1.2 Problem Statement
  1.3 Purpose Statement
  1.4 Research Questions
  1.5 Research Contributions
  1.6 Research Motivation
  1.7 Research Structure
2 Literature Review
  2.1 The Stock Market in South Africa
    2.1.1 The Financial Times Stock Exchange/Johannesburg Stock Exchange All-Share Index
  2.2 The use of Asset Pricing Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
    2.2.1 The Capital Asset Pricing Model
    2.2.2 The Arbitrage Pricing Model
    2.2.3 The Multi-Factor Model
      The Estimation of Alpha and Systematic Risk Factors
      The Estimation of Cumulative Systematic Risk Factors
    2.2.4 The Modern Portfolio Model
    2.2.5 The Drawbacks of Asset Pricing Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
    2.2.6 The Rationale for Comparing the Performance of Statistical and Machine Learning Models
  2.3 The use of Conventional Statistical Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
    2.3.1 Statistical Univariate Forecasting Models
      The Autoregressive Integrated Moving Average Model
      The Seasonal Autoregressive Integrated Moving Average Model
    2.3.2 The Statistical Multivariate Forecasting Model
      The Vector Autoregressive Model
    2.3.3 The Drawbacks of Conventional Statistical Forecasting Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
  2.4 State of Literature
    2.4.1 Related Research
  2.5 Research Gaps
  2.6 The use of Machine Learning Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
    2.6.1 The Ordinary Least-Squares Regression Model
    2.6.2 The Decision Tree Model
    2.6.3 The Random Forest Tree Model
    2.6.4 The Extreme Gradient Boosting Tree Model
  2.7 The use of Neural Networks in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
    2.7.1 The Recurrent Neural Network
    2.7.2 The Gated Recurrent Unit Network
    2.7.3 The Long-Short Term Memory
    2.7.4 The Restricted Boltzmann Machine
    2.7.5 The Multi-Layer Perceptron
  2.8 The Activation Function
    2.8.1 The Rectified Linear Unit Activation Function
3 Research Methods
  3.1 The Acquisition of Economic and Stock Market Data of South Africa
    3.1.1 Sources of Economic and Stock Market Data of South Africa
    3.1.2 Properties of Economic and Stock Market Data of South Africa
    3.1.3 Strategies used to Cleanse and Preprocess Data Before Forecasting
  3.2 The Strategy used to Impute Missing Economic Data of South Africa
    3.2.1 The k Nearest Neighbor Imputation Strategy
  3.3 The Strategy used to Replace Outliers in Economic and Stock Market Returns Data of South Africa
    3.3.1 The Median Outlier Replacement Strategy
  3.4 The Strategy used to Reduce Economic Data Dimensions of South Africa
    3.4.1 The Principal Components Analysis Method
  3.5 The Strategy used to Partition Economic and Stock Market Returns Data of South Africa
    3.5.1 The Random Data Partitioning Strategy
  3.6 The Strategy used to Scale Economic Data of South Africa
    3.6.1 The Data Standardization Scaling Strategy
  3.7 Experiments for Model Comparison
  3.8 Strategies used to Regularize Regression Models
    3.8.1 The Ridge Model Regularization Strategy
    3.8.2 The Least Absolute Shrinkage and Selection Operator Model Regularization Strategy
    3.8.3 The Elastic Net Model Regularization Strategy
  3.9 The Metric used to Evaluate the Performance of Models
    3.9.1 The Mean Absolute Percentage Error Metric
  3.10 Ethical Consideration
  3.11 Research Methods Summary
4 Experiment Results & Discussions
  4.1 The Exploration of the Distribution of Stock Market Returns Data in South Africa
    4.1.1 The Granger-Causal Relationship between Economic Activities and Stock Market Returns in South Africa
  4.2 The Benchmark for Model Comparison
    4.2.1 The Performance of the Vector Autoregressive Model when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
  4.3 The Performance of Default Machine Learning Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
    4.3.1 The Performance of Default Regression Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
    4.3.2 The Performance of Default Tree-Based Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
    4.3.3 The Performance of Default Neural Networks when Forecasting Reaction of Stock Market Returns to Changing Economic Activities in South Africa
  4.4 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by Tree-based Models
    4.4.1 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by the Decision Tree Model
    4.4.2 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by the Random Forest Tree Model
    4.4.3 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by the Extreme Gradient Boosting Tree Model
    4.4.4 The Performance of Models Trained on a Feature Subset Containing Key Economic Features when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
  4.5 The Reduction of Economic Data Dimensions
    4.5.1 Economic Features of South Africa in Different Dimensions
    4.5.2 An Index that Offers Insight into the Structure of the Economy in South Africa
    4.5.3 The Performance of Models Trained on Low Economic Data Dimensions when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
  4.6 Overall Performance Summary
5 Conclusions & Future Research
  5.1 Introduction
  5.2 Literature Review
  5.3 Research Methods
  5.4 Experimental Results
  5.5 Policy Recommendations
  5.6 Learned Model Application
  5.7 Future Research
A  A List of Economic Features used to Forecast the Reaction of Stock Market Returns in South Africa
B  A Full Index that Offers Insight Into the Structure of the Economy in South Africa
C  Coefficients of the Optimal Model
Bibliography

List of Figures

2.1 The reaction of stock market returns to changing economic activities, i.e., (A) economic activities, (B) international economic activities, (C) money and banking activities, (D) capital markets activities, and (E) national government finance activities
3.1 The modeling pipeline exhibiting the workflow that produces models that forecast the reaction of stock market returns to changing economic activities
4.1 The distribution of stock market returns data in South Africa
4.2 The loss function value across epochs of default neural networks when forecasting the reaction of stock market returns to changing economic activities in South Africa
4.3 The top ten economic features in a feature subset selected by the decision tree model, as measured by the Gini impurity value
4.4 The top ten economic features in a feature subset selected by the random forest tree model, as measured by the Gini impurity value
4.5 The top ten economic features in a feature subset selected by the extreme gradient boosting tree model, as measured by the Gini impurity value
4.6 Economic features of South Africa in two dimensions
4.7 Economic features of South Africa in three dimensions
4.8 The top performance of each machine learning model over four periods
4.9 The top performance of each tree-based feature selection strategy
4.10 The learning curve of the optimal model that forecasts the reaction of stock market returns to changing economic activities in South Africa

List of Tables

1.1 Research structure
2.1 A summary of used economic features and their categories
2.2 The drawbacks of asset pricing models when forecasting the reaction of stock market returns to changing economic activities
2.3 The drawbacks of conventional statistical forecasting models when forecasting the reaction of stock market returns to changing economic activities
2.4 Related research
3.1 Sources of economic and stock market data of South Africa, along with the description of data and the sample period
3.2 The properties of economic and stock market data of South Africa, and commentary on how those properties affect models that forecast the reaction of stock market returns to changing economic activities, along with strategies for addressing non-adherence to model requirements
3.3 Strategies used to cleanse and preprocess data before forecasting the reaction of stock market returns to changing economic activities in South Africa
3.4 Experiments considered when determining whether machine learning models perform better than the benchmark model when forecasting the reaction of stock market returns to changing economic activities in South Africa
3.5 Research methods summary
4.1 The descriptive statistics of stock market returns data in South Africa
4.2 The Granger-Causality relationship between selected economic activities and the stock market returns in South Africa
4.3 Selected economic features in the Granger-Causality Matrix
4.4 The performance of the vector autoregressive model when forecasting the reaction of stock market returns to changing economic activities in South Africa
4.5 The performance of default regression models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.6 The performance of default tree-based models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.7 The performance of default neural networks when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.8 The performance of models trained on a feature subset containing key economic features when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.9 An index that provides insight into economic activities in South Africa
4.10 The performance of models trained on low economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.11 The top ten highest-performing machine learning models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.12 The top ten worst-performing machine learning models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.13 The ranking of economic features based on the Gini impurity value calculated by each tree-based model
5.1 Research questions, various experiments, and research findings
A.1 A list of economic features used to forecast the reaction of stock market returns in South Africa
B.1 A full index that offers insight into the structure of the economy in South Africa
C.1 Coefficients of the optimal model (the ridge model)

1 Introduction to the Research

1.1 Introduction to the Research

The Johannesburg Stock Exchange (JSE) can reasonably be regarded as the primary stock market in Africa: it was the 17th largest stock exchange in the world in 2017 [SSEInitiative 2021; JSELtd 2021] and is the largest stock exchange in Africa, with a total market value of US$1.36 trillion and 339 listed stocks from various industries [SSEInitiative 2021].
Consequently, forecasting the returns of stocks listed on the JSE is important, but doing so for all stocks remains challenging given the number of listed stocks.

Information about historical stock market index returns of South Africa is critical for economic policymakers and investors, but it is insufficient on its own for decision-making, since it contains no forecasts of how the stock market index will behave in the future. Hence, policymakers rely on predictive analytical models to model the data-generating process and produce economic growth forecasts, before developing and implementing policies that accomplish strategic economic objectives.

The principal objective of investors is to avert financial loss while maximizing predicted asset returns. However, this objective may not be met because of economic risk factors. To attain it, investors focus heavily on mitigating unfavorable economic risk factors and capitalizing on favorable ones. Policymakers also require substantial information to develop policies that eliminate systematic market errors, conduct monetary and fiscal interventions for economic stability, and promote equity-market-friendly policies.

Previous research on the reaction of stock market returns to changing economic activities in South Africa used conventional asset pricing models and statistical forecasting models that suffer from methodological and practical limitations, which motivates the use of more sophisticated analytical tools such as machine learning models. Furthermore, there is little understanding of how machine learning methods compare to traditional forecasting models in forecasting the reaction of stock market returns in the modern economic setting.

This study looks at the dynamics of the economy and how stock market returns respond to changes in economic activity in South Africa.
It considers several experimental settings, such as alternative feature sets, data dimensions, and model parameters, to test whether machine learning algorithms outperform the benchmark model (the vector autoregressive model) in predicting stock market returns.

This research is valuable for investors who are developing investment strategies, as well as for policymakers who want to identify appropriate analytical tools for answering critical practice and policy questions about economic activities and the reaction of stock market returns in South Africa.

1.2 Problem Statement

Policymakers typically use statistical multivariate forecasting models to forecast the reaction of stock market returns to changing economic activities, then use insights from these models to structure economic policies that reduce systematic market errors and benefit the stock market [Olayode et al. 2021; Saxena and Bhadauriya 2021; Shamsudin et al. 2021].

Previous research on the reaction of stock market returns to changing economic activities focused on leading economies such as the U.S. economy [Bordo and Jeanne 2002; Stadtmann and Dunsch 2018]. However, research on emerging economies has stalled and still relies on conventional forecasting models [Macfarlane 2011; Tsepang Patrick 2013; Chifurira and Chinhamu 2019].

Prominent research used asset pricing models to understand the general equilibrium of exchange [Malamud 2015], some research used conventional statistical univariate forecasting models [Pillay 2020; Chitenderu et al. 2014; Makatjane and Moroke 2021], and most research repeatedly used a common statistical multivariate forecasting model, i.e., the vector autoregressive model [Adejayan and Oke 2022], which is the benchmark model in this research due to its widespread adoption.
Forecasting the reaction of stock market returns to changing economic activities using statistical multivariate forecasting models remains challenging because these models are inflexible and unable to model non-linear relationships [Pratiwi et al. 2021]. On the other hand, Fu et al. [2018]; Wong et al. [2020]; Kamalov [2020], among others, suggested adopting machine learning models, because these models can better handle data from non-linear dynamic systems and yield outstanding model performance.

There is currently limited use of machine learning to forecast the reaction of stock market returns in the emerging-economy context [Ramos-Pérez et al. 2021; Klibanov et al. 2021; Makatjane and Moroke 2021]. Furthermore, most research used whole feature sets, compared only a few models, and paid little attention to how different feature subsets and reduced dimensionality change model performance, a limitation this research addresses through the number of experiments it considers.

This research established that machine learning models can be used to forecast the reaction of stock market returns to changing economic activities in the emerging-economy context. It advances research by comparing various models trained on different feature subsets and economic data dimensions.

1.3 Purpose Statement

This research investigated the dynamics of the economy and how stock market returns react to changing economic activities in South Africa. It considered various experiments, i.e., different feature subsets and data dimensions, to determine whether machine learning models perform better than the benchmark model (the vector autoregressive model) when forecasting the reaction of stock market returns. It also produced an index that offers insight into the structure of the economy in South Africa.

1.4 Research Questions

This research responded to four crucial research questions:

1. How do stock market returns react to changing economic activities in South Africa?
2. Do machine learning models perform better than the benchmark model (the vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the mean absolute percentage error (MAPE) metric?
3. Stock market returns react differently to changing economic activities based on the magnitude of change. Do models trained on a feature subset containing key economic features, selected based on the Gini impurity value calculated by tree-based models, perform better than models trained on whole features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric?
4. Is the performance of models trained on low economic data dimensions (found by the principal components analysis method) distinguishable from the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric?

1.5 Research Contributions

This research makes the following contributions:

• Determine whether machine learning models perform better than the benchmark model when forecasting the reaction of stock market returns to changing economic activities in South Africa across various experiments, as measured by the MAPE metric.
• Present key economic features in South Africa when forecasting the reaction of stock market returns based on the Gini impurity value calculated by tree-based models.
• Provide insight into the structure of the economy in South Africa by producing an index that ranks economic activities based on the explained variance ratio value calculated by the principal components analysis method.
• Show how different feature subsets and data dimensions change the MAPE value of models when forecasting the reaction of stock market returns to changing economic activities in South Africa. 1.6 Research Motivation This study is significant because it analyzes how changes in economic activity affect stock market returns in South Africa. It increases public awareness of the technical measurements used in statistical and machine learning methodologies to forecast stock market index returns. Furthermore, innovative analytical solutions for increasing the performance of stock market index return prediction models. This thesis contributes to the body of knowledge on stock market returns. Insights into the economic causes of stock market returns may be valuable to policymakers, because they may 4 better understand how their policies impact the market. Practitioners may also gain insight into the stock market’s behavioral and important tendencies for making investment judg- ments. 1.7 Research Structure Table 1.1 shows the research structure. TABLE 1.1: Research structure Number Chapter Heading Chapter Function 1 Introduction This chapter states the research problem (section 1.2), the research purpose (section 1.3), research questions (section 1.4), and research contributions (section 1.5). 2 Literature Review This chapter covers the stock market in South Africa (section 2.1), highlights primary asset pricing models for forecasting stock market returns (section 2.2) and statistical univariate and multivariate forecasting mod- els commonly used to forecast stock market returns (subsection 2.3.1 and subsection 2.3.2), then presents the drawbacks of using asset pricing models and statis- tical forecasting models (subsection 2.2.5 and subsec- tion 2.3.3). 
This chapter equally refers to the state of literature and related research (section 2.4), then reveals candidate machine learning models for addressing the drawbacks of using asset pricing models and statistical forecasting models (section 2.6, and section 2.7) 3 Research Methods This chapter covers sources and properties of eco- nomic and stock market data of South Africa (sub- section 3.1.1 and subsection 3.1.2), then proceeds to specify strategies for data imputation (section 3.2), out- lier replacement (section 3.3), data dimension reduction (section 3.4), data partitioning (section 3.5), data scal- ing (section 3.6), and model regularization (section 3.8). The chapter concludes by revealing the metric used to evaluate the performance of candidate models (sec- tion 3.9). Continued on next page 5 Table 1.1 – Continued from previous page Number Chapter Heading Chapter Function 4 Experiment Results & Discus- sions This chapter shows exploratory descriptive statistical results of stock market returns in South Africa (sec- tion 4.1), produces an index that offers insight into the structure of the economy in South Africa (subsec- tion 4.5.2), reports the performance of the benchmark model when forecasting the reaction of stock markets to changing economic activities in South Africa, as mea- sured by the MAPE value (section 4.2), then report the performance of machine learning models when fore- casting the reaction of stock markets to changing eco- nomic activities in South Africa, as measured by the MAPE value (section 4.3 and subsection 4.3.3). The last segment of the chapter determines how differ- ent feature subsets (subsection 4.4.4) and data dimen- sions (subsection 4.5.3) change the MAPE value of can- didate models. 
5 | Conclusions & Future Work | This chapter recaps the reviewed literature (section 5.2), covers the research method used (section 5.3), summarizes experiment results (section 5.4), provides practice and policy recommendations (section 5.5), details the use of the learned model (section 5.6), and highlights the roadmap for forthcoming research (section 5.7).

2 Literature Review

This research adopts the conceptual framework for forecasting the reaction of stock market returns to changing economic activities from previous research by Ahangar et al. [2010], Jasra et al. [2012], and Ndikum [2020].

FIGURE 2.1: The reaction of stock market returns to changing economic activities, i.e., (A) economic activities, (B) international economic activities, (C) money and banking activities, (D) capital markets activities, and (E) national government finance activities

This research differs from previous research on the reaction of stock market returns to changing economic activities in that it considers various economic features. The reaction of stock market returns in South Africa is the target feature, and economic features are predictor features (Table A.1 shows a thorough list of the economic features used).

Table 2.1 shows a summary of the economic features used and their categories.

TABLE 2.1: A summary of used economic features and their categories
Economic Feature | Category
Indices: consumer price, producer price, domestic mining and quarrying activities, domestic manufacturing, etc. Rates: interest, currency exchange, yields, bonds, securities, etc. Services: electricity, fuels, gas, supplies, water, etc. Goods: net trade and food, etc. | Economic category
Net average daily turnover, South African Reserve Bank (SARB) gross reserves in foreign currency, gold and other foreign reserves, etc. | International economic category
Money supply, credit, deposits and advances, liabilities, investment treasury bills, short-term credit, return on equity, etc. |
Money and banking category
Traded shares value, fixed interest securities market, non-resident transactions, equity derivative markets, etc. | Capital markets category
National government revenue, expenditure, borrowing, financing of net borrowing requirements, etc. | National government finance category

2.1 The Stock Market in South Africa

The Johannesburg Stock Exchange (JSE) is the official stock exchange of South Africa. It was the 17th largest stock exchange in the world in 2017, as measured by market capitalization [SSEInitiative 2021; JSELtd 2021]. It is also the largest exchange in Africa, with a total market value of US$1.36 trillion and 339 listed stocks across diverse sectors [SSEInitiative 2021].

It is reasonable to consider South Africa the primary stock market in Africa. Forecasting returns of stocks listed on the JSE is valuable, but doing so for all stocks remains a challenging task given the number of listed stocks.

This research focuses solely on the reaction of stock market index returns (Financial Times Stock Exchange/Johannesburg Stock Exchange (FTSE/JSE) all-share index returns) to changing economic activities in South Africa.

2.1.1 The Financial Times Stock Exchange/Johannesburg Stock Exchange All-Share Index

The FTSE/JSE all-share index represents the benchmark performance of stocks listed on the JSE for 99.9% of the total market value. The benchmark performance is calculated using the market capitalization-weighted index method. Equation 2.1 defines the market capitalization-weighted index method:

$$\mathrm{MCPWI} = w_1 p_1 + w_2 p_2 + \dots + w_n p_n, \tag{2.1}$$

where $w_i$ denotes the weight of share $i$ and $p_i$ denotes its share price. Equation 2.2 defines the weights:

$$w_i = \frac{MC_i}{\sum_{j=1}^{n} MC_j}, \tag{2.2}$$

where $MC_i$ denotes the total market capitalization of share $i$.
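As an illustration of Equation 2.1 and Equation 2.2, the sketch below computes capitalization weights and the resulting weighted index value. The function names and the three-stock figures are hypothetical and are not FTSE/JSE data; the official index methodology involves further adjustments that are not modeled here.

```python
def cap_weights(market_caps):
    """Each stock's share of total market capitalization (Equation 2.2)."""
    total = sum(market_caps)
    return [mc / total for mc in market_caps]


def cap_weighted_index(market_caps, prices):
    """Weighted sum of share prices (Equation 2.1)."""
    weights = cap_weights(market_caps)
    return sum(w * p for w, p in zip(weights, prices))


# Hypothetical market capitalizations and share prices for three stocks.
caps = [500.0, 300.0, 200.0]
prices = [100.0, 50.0, 20.0]

weights = cap_weights(caps)
index_value = cap_weighted_index(caps, prices)
print(weights, index_value)
```

The weights sum to one by construction, so a stock with a larger market capitalization contributes proportionally more to the index value.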
2.2 The use of Asset Pricing Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities

The adoption of analytical tools for forecasting stock market returns and developing investment strategies is a long-standing practice [Burdenko 2017; Ndikum 2020]. Research on stock market returns in South Africa has repeatedly used asset pricing models [Carter et al. 2017].

The capital asset pricing model values assets (subsection 2.2.1), then anticipates the future cash flow of assets [Munk 2013]. This model focuses on the general equilibrium of exchange in the market [Malamud 2015].

Alternatives to the capital asset pricing model, i.e., the arbitrage pricing model (subsection 2.2.2) and multi-factor model (subsection 2.2.3), consider systematic economic risk factors, while the modern portfolio model considers portfolio diversification (subsection 2.2.4).

2.2.1 The Capital Asset Pricing Model

The capital asset pricing model identifies the expected asset return as a function of the risk-free return and risk premium, along with the discount rate of the net present value [Lintner 1965; Mossin 1966].

Reddy and Thomson [2011] used the capital asset pricing model to explain expected excess stock market returns and determine the linkage between expected stock market returns and the beta in South Africa.

2.2.2 The Arbitrage Pricing Model

The arbitrage pricing model uses a linear framework to identify the extent to which systematic economic risk factors influence the expected asset return [Huberman 2005]. Equation 2.3 defines the arbitrage pricing model:

$$ER(x) = R_f + \beta_1 Rp_1 + \beta_2 Rp_2 + \dots + \beta_n Rp_n, \tag{2.3}$$

where $ER(x)$ denotes the expected asset return, $R_f$ denotes the risk-free asset return, $\beta_n$ denotes the sensitivity of the asset price to fluctuations in the $n$-th systematic economic risk factor, and $Rp_n$ denotes the risk premium emanating from that factor.
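To make Equation 2.3 concrete, the sketch below evaluates the expected asset return for a given set of factor sensitivities and risk premiums. The function name and the numeric inputs are hypothetical illustrations, not estimates from this research.

```python
def apt_expected_return(risk_free, betas, premiums):
    """Expected asset return under the arbitrage pricing model (Equation 2.3)."""
    if len(betas) != len(premiums):
        raise ValueError("each beta needs a matching risk premium")
    # Risk-free return plus the sum of sensitivity-weighted risk premiums.
    return risk_free + sum(b * rp for b, rp in zip(betas, premiums))


# Hypothetical inputs: a 5% risk-free return and two systematic risk factors.
expected = apt_expected_return(0.05, betas=[1.2, 0.5], premiums=[0.03, 0.02])
print(expected)
```

With these assumed inputs, the first factor adds 1.2 x 0.03 and the second adds 0.5 x 0.02 to the risk-free return.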
Muzindutsi and Niyimbanira [2012] used the arbitrage pricing model to determine the exposure of the returns of the top forty performing stocks listed on the JSE to a systematic economic risk factor, the exchange rate.

2.2.3 The Multi-Factor Model

The multi-factor model uses the arbitrage pricing model as a baseline asset pricing model to identify the extent to which expected asset returns react to systematic economic risk factors, i.e., the inflation rate, interest rate, and economic cycle, among other systematic economic risk factors [Fama and French 1993]. The model also considers market uncertainty, along with individual and joint variability among assets (or portfolios). Equation 2.4 defines the multi-factor model:

$$R_i = E(R_i) + \beta_{i1} F_1 + \beta_{i2} F_2 + \dots + \beta_{ik} F_k + \epsilon_i, \tag{2.4}$$

where $R_i$ denotes the asset return, $E(R_i)$ denotes the expected asset return, $\beta_{ik}$ denotes the sensitivity of the asset or portfolio return to fluctuations in the $k$-th systematic economic risk factor, $F_k$ denotes the systematic economic risk factor, and $\epsilon_i$ denotes market uncertainty.

Mukoyi and Ogujiuba [2022] compared the performance of various multi-factor models (i.e., the Fama and French three-factor model, Carhart four-factor model, and Fama and French five-factor model) when forecasting the reaction of stock market returns in the resource sector, industrial sector, and financial sector of South Africa to investment style risk.

The Estimation of Alpha and Systematic Risk Factors

Stock market research frequently estimates the $\alpha$ (alpha) and $\beta$ (beta) of the portfolio by fitting a security characteristic line in the linear model (Equation 2.5):

$$R_i - R_f = \alpha_i + \beta_i (R_M - R_f) + \epsilon_i, \tag{2.5}$$

where $R_i$ denotes the realized portfolio return, $R_M$ denotes the market return, $R_f$ denotes the risk-free return, $\alpha_i$ denotes the alpha of the portfolio, and $\beta_i$ denotes the beta of the portfolio.
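The security characteristic line can be fitted by regressing excess portfolio returns on excess market returns. The sketch below is a minimal illustration using NumPy's `polyfit`; the function name and the return series are hypothetical, and the example assumes the standard form in which both sides are measured in excess of the risk-free return.

```python
import numpy as np


def estimate_alpha_beta(portfolio_returns, market_returns, risk_free):
    """Fit the security characteristic line by least squares.

    Returns (alpha, beta), the intercept and slope of excess portfolio
    returns regressed on excess market returns.
    """
    excess_portfolio = np.asarray(portfolio_returns, dtype=float) - risk_free
    excess_market = np.asarray(market_returns, dtype=float) - risk_free
    # polyfit returns coefficients from the highest degree down: slope, intercept.
    beta, alpha = np.polyfit(excess_market, excess_portfolio, deg=1)
    return alpha, beta


# Hypothetical, noise-free returns built with alpha = 0.001 and beta = 1.5.
risk_free = 0.005
market = np.array([0.02, -0.01, 0.03, 0.015, -0.02])
portfolio = risk_free + 0.001 + 1.5 * (market - risk_free)

alpha, beta = estimate_alpha_beta(portfolio, market, risk_free)
print(alpha, beta)
```

Because the illustrative series contains no noise, the fit recovers the assumed alpha and beta; real return series would yield estimates with sampling error.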
The Estimation of Cumulative Systematic Risk Factors

The linear function considers systematic economic risk factors, idiosyncratic risk, and the expected portfolio return, along with transaction cost. Equation 2.6 defines the linear function:

$$f(h) = \frac{1}{2}\kappa h_t^{T} Q^{T} Q h_t + \frac{1}{2}\kappa h_t^{T} S h_t - \alpha^{T} h_t + (h_{t-1})^{T} \Lambda (h_{t-1}), \tag{2.6}$$

where $\kappa$ denotes a risk aversion factor. The multi-factor model also considers market neutrality, along with the position size and diversification of the portfolio. Equation 2.7 defines the gradient (slope coefficient) in the linear model:

$$f'(h) = \frac{1}{2}\kappa(2Q^{T}Qh) + \frac{1}{2}\kappa(2Sh) - \alpha + 2(h_{t-1})\Lambda. \tag{2.7}$$

Each common risk considers a systematic economic risk factor in the linear model. Equation 2.8 defines common risk:

$$C_{risk} = \frac{1}{2} h_t^{T} \beta F \beta^{T} h_t. \tag{2.8}$$

The multi-factor model considers the portfolio return variance, drives the variance toward zero, and uses vector estimates of $\alpha$. The model also considers the response of the asset price to the market impact for each currency unit exchanged. Equation 2.9 transforms $\beta$ to $Q$ to limit matrix expansion:

$$C_{risk} = \frac{1}{2} h_t^{T} Q^{T} Q h_t. \tag{2.9}$$

Equation 2.10 defines the linear impact model:

$$\sum_{i=1}^{N} \lambda_{(i,t)} \left( h_{(i,t)} - h_{(i,t-1)} \right)^{2}, \tag{2.10}$$

where

$$\lambda_{(i,t)} = \frac{1}{10 \times ADV_{(i,t)}}, \tag{2.11}$$

where $ADV_{(i,t)}$ denotes the average daily volume traded for each 10 basis points.

2.2.4 The Modern Portfolio Model

The modern portfolio model treats portfolio return volatility as a risk proxy, along with its expected return volatility and return weights. The model also accounts for the statistical dependence among assets and identifies the efficient frontier of the portfolio [Elton and Gruber 2018]. Equation 2.12 estimates the expected portfolio return:

$$E(R_p) = \sum_{i=1}^{n} w_i E(R_i), \tag{2.12}$$

where $R_p$ denotes the portfolio return, $E(R_i)$ denotes the expected asset return, and $w_i$ denotes the weights of the asset return.
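The sketch below illustrates Equation 2.12 for a two-asset portfolio and also computes the portfolio return variance in the weighted-covariance form that the text goes on to define (Equation 2.13 and Equation 2.16). The function names, weights, expected returns, volatilities, and correlation are all hypothetical values chosen for illustration.

```python
import numpy as np


def portfolio_expected_return(weights, expected_returns):
    """Weighted sum of expected asset returns (Equation 2.12)."""
    return float(np.dot(weights, expected_returns))


def portfolio_variance(weights, volatilities, correlation):
    """Portfolio return variance from weights, volatilities, and correlations."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(volatilities, dtype=float)
    # Covariance matrix: sigma_i * sigma_j * dependence between assets i and j.
    covariance = np.outer(s, s) * np.asarray(correlation, dtype=float)
    return float(w @ covariance @ w)


# Hypothetical two-asset portfolio (asset A and asset B).
w = [0.6, 0.4]
exp_ret = portfolio_expected_return(w, [0.10, 0.05])
var = portfolio_variance(w, [0.2, 0.1], [[1.0, 0.3], [0.3, 1.0]])
vol = var ** 0.5
print(exp_ret, var, vol)
```

The quadratic form `w @ covariance @ w` includes both the own-variance terms and the cross terms, so it reduces to the two-asset expression when only assets A and B are held.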
Equation 2.13 defines the portfolio return variance:

$$\sigma_p^{2} = \sum_{i=1}^{n} w_i^{2} \sigma_i^{2} + \sum_{i=1}^{n} \sum_{j \neq i}^{n} w_i w_j \sigma_i \sigma_j p_{ij}, \tag{2.13}$$

where $\sigma_i$ denotes the extent to which the asset return deviates from the mean asset return, $w_i$ denotes the weights of the asset return, and $p_{ij}$ denotes the statistical dependence among assets. Equation 2.14 defines the portfolio return volatility:

$$\sigma_p = \sqrt{\sigma_p^{2}}. \tag{2.14}$$

Equation 2.15 defines the expected portfolio return containing two assets (i.e., asset A and B):

$$E(R_p) = w_A E(R_A) + w_B E(R_B) = w_A E(R_A) + (1 - w_A) E(R_B). \tag{2.15}$$

Equation 2.16 defines the portfolio return variance:

$$\sigma_p^{2} = w_A^{2} \sigma_A^{2} + w_B^{2} \sigma_B^{2} + 2 w_A w_B \sigma_A \sigma_B p_{AB}. \tag{2.16}$$

Taljaard and Maré [2021] used the modern portfolio model to identify changes in the concentration of market capitalization weights in the top forty performing stocks listed on the JSE.

2.2.5 The Drawbacks of Asset Pricing Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities

This research acknowledges the importance of asset pricing models and their contribution to our understanding of forecasting the reaction of stock market returns to changing economic activities. However, these models have certain drawbacks (Table 2.2).

TABLE 2.2: The drawbacks of asset pricing models when forecasting the reaction of stock market returns to changing economic activities
Model | Drawback
The capital asset pricing model | The capital asset pricing model is criticized for its simplicity and impracticality, along with its failure to consider the statistical significance between systematic economic risk factors and the expected asset return [Muthama et al. 2014; Andrei et al. 2018].
The arbitrage pricing model | Prevailing research on the reaction of stock market returns to changing economic activities holds that exploiting the expected asset return should not remain the sole focus of investors.
Instead, they should also consider the asset return volatility or the portfolio return volatility [Sinha 2016; Khudoykulov 2017].
The modern portfolio model | The modern portfolio model does not capture the concern of investors about downside risk, which represents the financial risk of losses from investing in assets or a portfolio [Otuteye and Siddiquee 2017; Crack and Grieves 2017]. The model also does not consider a scenario in which the expected portfolio return exceeds the actual portfolio return [Hou et al. 2017].

2.2.6 The Rationale for Comparing the Performance of Statistical and Machine Learning Models

The previous sections discussed the most conventional models for forecasting stock market returns, namely the capital asset pricing model (subsection 2.2.1), the arbitrage pricing model (subsection 2.2.2), the multi-factor model (subsection 2.2.3), and the modern portfolio model (subsection 2.2.4). They also discussed the inadequacies of these models in stock market forecasting. Furthermore, the modern portfolio model was not discussed in depth since this study does not focus on several South African stocks, but rather on the stock market index, which includes all South African stocks.

In conclusion, the typical asset pricing methods outlined above are unable to manage the complexity of the data set employed in this study. This dissertation examined the effectiveness of statistical models and machine learning models in forecasting the impact of economic activity on stock market returns.

2.3 The use of Conventional Statistical Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities

Previous research on the reaction of stock markets in South Africa used conventional statistical forecasting models. For instance, Pillay [2020]; Chitenderu et al. [2014]; Makatjane and Moroke [2021] used statistical univariate forecasting models, and Aye et al.
[2020]; Ilesanmi and Tewari [2020]; Adejayan and Oke [2022] used statistical multivariate models.

2.3.1 Statistical Univariate Forecasting Models

Statistical univariate forecasting models identify predictive patterns of a temporal feature and forecast successive patterns [Babu and Reddy 2014]. These forecasting models lead research on stock market returns in South Africa [Pillay 2020; Chitenderu et al. 2014; Makatjane and Moroke 2021].

This research covers two conventional statistical forecasting models, i.e., the autoregressive integrated moving average model (section 2.3.1) and seasonal autoregressive integrated moving average model (section 2.3.1), to provide the background of statistical univariate forecasting.

The Autoregressive Integrated Moving Average Model

The autoregressive integrated moving average (ARIMA) (p, d, q) (or Box-Jenkins) model combines p (the lag order), d (the degree of differencing applied to the temporal feature), and q (the moving average order) to identify predictive patterns and forecast a temporal feature [Young and Shellswell 1972]. Equation 2.17 defines the ARIMA (p, d, q) model:

$$\hat{y}_t - y_{t-1} = \hat{\mu} + \hat{\phi}(y_{t-1} - y_{t-2}) + \dots + \hat{\epsilon}_t, \tag{2.17}$$

where $\hat{y}_t$ denotes estimates of the temporal feature at period $t$, $\hat{\mu}$ denotes unbiased estimates of the temporal feature, and $\hat{\epsilon}_t$ denotes residuals of the ARIMA (p, d, q) model.

Mallikarjuna and Rao [2019] used the ARIMA (p, d, q) model, along with other models like the self-exciting threshold autoregressive model, recurrent neural network, and a hybrid model of the ARIMA (p, d, q) model and recurrent neural network, to forecast stock market returns in South Africa.

The Seasonal Autoregressive Integrated Moving Average Model

The seasonal autoregressive integrated moving average (SARIMA) (p, d, q) × (P, D, Q, s) model extends the ARIMA (p, d, q) model with additional seasonal parameters (i.e., p and seasonal P, d and seasonal D, q and seasonal Q) [Hyndman and Athanasopoulos 2013].
Equation 2.18 defines the SARIMA (p, d, q) × (P, D, Q, s) model:

$$\hat{y}^{s}_t = y_t - y_{t-4} + \hat{\epsilon}_t. \tag{2.18}$$

Equation 2.19 rewrites Equation 2.18 using the backshift operator $B$, where $B^4 y_t = y_{t-4}$:

$$y^{s}_t = (1 - B^4) \times y_t. \tag{2.19}$$

Makatjane and Moroke [2021] used the SARIMA (p, d, q) × (P, D, Q, s) model to forecast stock market returns in South Africa.

2.3.2 The Statistical Multivariate Forecasting Model

Statistical multivariate forecasting models identify predictive patterns of temporal features and forecast successive patterns while considering model residuals. Research on the reaction of stock market returns to changing economic activities in South Africa has repeatedly used a popular conventional statistical multivariate forecasting model, the vector autoregressive model [Aye et al. 2020; Ilesanmi and Tewari 2020; Adejayan and Oke 2022].

This research uses the vector autoregressive model as the benchmark model because of its widespread adoption in research.

The Vector Autoregressive Model

The vector autoregressive model describes a stochastic system in which each feature is a linear combination of p lags of itself and p lags of the other features [Gouriéroux et al. 2017]. For clarification, assume a temporal problem with two features and two lags. Equation 2.20 and Equation 2.21 define the vector autoregressive model:

$$\hat{y}_t = \alpha + \hat{\beta}_{11} y_{t-1} + \hat{\beta}_{12} y_{t-2} + \hat{\gamma}_{11} x_{t-1} + \hat{\gamma}_{12} x_{t-2} + \hat{\epsilon}_{1t}, \tag{2.20}$$

$$x_t = \alpha + \hat{\beta}_{21} y_{t-1} + \hat{\beta}_{22} y_{t-2} + \hat{\gamma}_{21} x_{t-1} + \hat{\gamma}_{22} x_{t-2} + \hat{\epsilon}_{2t}, \tag{2.21}$$

where $\hat{y}_t$ denotes $k \times 1$ temporal feature estimates, $\hat{\alpha}$ denotes $k \times 1$ intercept vector estimates, $\hat{\gamma}_i$ denotes $k \times k$ coefficient matrix estimates, and $\hat{\epsilon}_t$ denotes a sequence of serially uncorrelated random vectors with zero mean and a covariance matrix that captures the joint variability among temporal features.

Pillay [2020] and Macfarlane [2011] used the vector autoregressive model to forecast the reaction of stock market returns to changing economic activities in South Africa.
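The two-feature, two-lag system in Equation 2.20 and Equation 2.21 can be estimated equation by equation with ordinary least squares. The sketch below is a minimal NumPy illustration on simulated data, together with a mean absolute percentage error (MAPE) helper expressed as a fraction. The function names, the simulated coefficients, and the noise scale are assumptions made for illustration; this is not the estimation procedure or data used in this research.

```python
import numpy as np


def fit_var2(y, x):
    """Least-squares fit of a two-feature vector autoregression with two lags.

    Returns a (5, 2) array: rows are intercept, y lag 1, y lag 2, x lag 1,
    x lag 2; column 0 holds the y equation and column 1 the x equation.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    design = np.column_stack(
        [np.ones(len(y) - 2), y[1:-1], y[:-2], x[1:-1], x[:-2]]
    )
    targets = np.column_stack([y[2:], x[2:]])
    coefs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coefs


def forecast_next(coefs, y, x):
    """One-step-ahead forecast of (y, x) from the last two observations."""
    row = np.array([1.0, y[-1], y[-2], x[-1], x[-2]])
    return row @ coefs


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)))


# Simulated series with hypothetical coefficients (illustration only).
rng = np.random.default_rng(0)
n = 200
y = np.zeros(n)
x = np.zeros(n)
for t in range(2, n):
    y[t] = 0.1 + 0.5 * y[t - 1] - 0.2 * y[t - 2] + 0.3 * x[t - 1] + 0.1 * x[t - 2] + rng.normal(scale=0.1)
    x[t] = 0.2 + 0.2 * y[t - 1] + 0.1 * y[t - 2] + 0.4 * x[t - 1] - 0.1 * x[t - 2] + rng.normal(scale=0.1)

coefs = fit_var2(y, x)
next_values = forecast_next(coefs, y, x)
print(coefs.shape, next_values)
```

In practice a dedicated time series library would typically be used to select the lag order and test stationarity before fitting; those steps are omitted from this sketch.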
The Structural Vector Autoregressive Model

The structural vector autoregressive model combines Equation 2.20 and Equation 2.21, then identifies causality among temporal features and tests the statistical significance of the relationships among them [Gouriéroux et al. 2017; Tank et al. 2021]. Equation 2.22 and Equation 2.23 define the structural vector autoregressive model:

$$\hat{y}_t = \hat{\alpha}_1 + \hat{\beta}_1 x_t + \hat{\phi}_{11} y_{t-1} + \hat{\phi}_{12} y_{t-2} + \hat{\gamma}_{11} x_{t-1} + \hat{\gamma}_{12} x_{t-2} + \hat{v}_{1t}, \tag{2.22}$$

$$x_t = \alpha_2 + \hat{\beta}_2 y_t + \hat{\phi}_{21} y_{t-1} + \hat{\phi}_{22} y_{t-2} + \hat{\gamma}_{21} x_{t-1} + \hat{\gamma}_{22} x_{t-2} + \hat{v}_{2t}. \tag{2.23}$$

2.3.3 The Drawbacks of Conventional Statistical Forecasting Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities

Conventional statistical forecasting models are widely regarded in previous research on the reaction of stock market returns to changing economic activities in South Africa, but they have drawbacks (Table 2.3).

TABLE 2.3: The drawbacks of conventional statistical forecasting models when forecasting the reaction of stock market returns to changing economic activities
Drawback | Description
Non-stationarity | Temporal features rarely follow a stationary process, and the model residual expansion problem is common in analysis, resulting in non-adherence to some requirements of conventional statistical forecasting models.
Non-linearity | Because temporal features are frequently non-linear, conventional statistical forecasting models are unsuitable for modeling complex data structures, i.e., when the data set contains multiple temporal features with high dimensions [Liu et al. 2021; Pahlawan et al. 2021a].
The curse of data dimensionality problem | As specified, conventional statistical forecasting models are incapable of handling a data set containing multiple temporal features with high dimensions [Ahangar et al. 2010].
Subpar model performance | Because temporal features do not adhere to the requirements of conventional statistical forecasting models, the model performance tends to be subpar [Zimmerman 1994; Anscombe and Guttman 1960; Pincus 1995].

The machine learning approach addresses the drawbacks of conventional statistical forecasting models [BenSaïda and Litimi 2013; Liu et al. 2020; Kennedy et al. 2020].

2.4 State of Literature

Policymakers, particularly those in emerging economies, became concerned about the instability in the stock market during and after the global market crisis that started in 2008 [Hedging and Umoetok 2013]. Such occurrences amplified the need for sophisticated analytical tools like machine learning models (section 2.6), because these models can capture complex market data structures [Ramos-Pérez et al. 2021; Liu et al. 2021; Pahlawan et al. 2021a].

2.4.1 Related Research

This research reviews previous research, then identifies related research debates, along with research inconsistencies and gaps it intends to fill, before selecting candidate machine learning models and the research method to use.

Table 2.4 shows the model criteria guiding this research, along with feature sets, model specifications, and the performance of models that forecast the reaction of stock market returns to changing economic activities.

TABLE 2.4: Related research
Author | Features | Model Specifications | Model Performance
Ndikum [2020] | 200 economic features across multiple categories (i.e., the economic category, and money and banking category, among other economic categories). | Compared the performance of the restricted Boltzmann machine and the deep belief network when forecasting the reaction of S&P 500 returns to 200 economic features, as measured by the mean squared error (MSE). | The restricted Boltzmann machine achieved an MSE value of 0.36, whereas the deep belief network achieved an MSE value of 0.35.
Ahangar et al. [2010] | Data comprised 40 microeconomic features and economic features across multiple categories (i.e., the rates category, and the money and banking category, among other economic categories). | 7 economic features were selected based on the explained variance ratio value found by the principal component analysis method. The deep belief network (with three hidden layers and 14 neurons) was used to forecast the reaction of stock market returns to changing economic activities in Iran. | The MSE value for the deep belief network was 31.6.
Pahlawan et al. [2021a] | Data included 20 economic features spread over multiple economic categories (i.e., the rates category, indices category, and money and banking category, among other economic categories). | Compared the performance of the recurrent neural network, gated recurrent unit, and long-short term memory when forecasting the reaction of stock market returns (particularly the S&P 500 returns) to changing economic activities in the U.S., as measured by the MSE. | The MSE value for the long-short term memory was 1.35, the MSE value for the gated recurrent unit was 1.55, and the MSE value for the recurrent neural network was 1.55.
Wong et al. [2020] | Data comprised 74 industry-specific features and 102 economic features. | The deep belief network (with an online early stopping strategy) was used to forecast the reaction of stock market returns (particularly the S&P 500 returns) to changing industry-specific activities and economic activities in the U.S. | The MSE value for the deep belief network was 50.22.
Kamalov [2020] | Data comprised the S&P 500 price volatility simulation data. |
Compared the performance of the long-short term memory, multi-layer perceptron, and convolutional neural network when classifying the reaction of stock market return volatility (particularly the S&P 500 return volatility) to changing economic activities in the U.S. | The long-short term memory outperformed the multi-layer perceptron and convolutional neural network, with a 0.85 area under the curve value.
Xiong et al. [2015] | Data contained multiple broad economic features. | The Shapley additive method was used to select features. The long-short term memory was used to forecast the reaction of stock market returns to a feature subset and whole economic features. | The first long-short term memory achieved an MSE value of 2 890, and the second long-short term memory achieved an MSE value of 2 880.
Klibanov et al. [2021] | Data comprised Russell index returns simulation data. | The deep belief network was used to classify Russell index returns. | The accuracy value for the deep belief network was 55.42.
Ramos-Pérez et al. [2021] | Data comprised a few economic features. | Compared the performance of the long-short term memory and multi-layer perceptron when forecasting the reaction of stock market return volatility (particularly S&P 500 return volatility) to changing economic activities in the U.S., as measured by the root mean squared error (RMSE). | The RMSE value for the multi-layer perceptron, which was astonishingly near zero, was the highest.
Fu et al. [2018] | Data comprised 244 technical and fundamental features. | Compared the performance of deep belief networks (with different structures) when forecasting the reaction of stock market returns (particularly S&P 500 returns) to changing technical features and economic activities in the U.S., as measured by the mean absolute error (MAE). |
The MAE values for deep belief networks were 2.98 and 0.97, respectively.
Alhomadi [2021] | Data comprised 30 macroeconomic features and U.S. stock market indices features. | Compared the performance of the ordinary least-squares regression model, elastic net model, support vector regression model, random forest model, and extreme gradient boosting model when forecasting the reaction of stock market returns (particularly the S&P 500 returns) to changing economic activities in the U.S., as measured by the R-Squared. | The R-Squared value of the ordinary least-squares regression model was 0.1967, the R-Squared value of the elastic net model was 0.4559, the R-Squared value of the support vector regression model was 0.4970, the R-Squared value of the random forest model was 0.3363, and the R-Squared value of the extreme gradient boosting model was 0.4215.
Nengovhela [2022] | Data comprised economic features and stock market indices features. | Compared the performance of the random forest model, k nearest neighbor model, support vector regression model, decision tree model, and neural network when forecasting the reaction of stock market returns to changing economic activities in South Africa. | The MAE value of the random forest model was 0.9609, the MAE value of the k nearest neighbor model was 0.9817, the MAE value of the support vector regression model was 1.1231, the MAE value of the decision tree model was 1.3247, and the MAE value of the neural network was 0.9819.
Mallikarjuna and Rao [2019] | Data comprised stock market indices features of developed, emerging, and frontier economies. |
Compared the performance of different models (i.e., the ARIMA (p, d, q) model, self-exciting threshold autoregressive model, recurrent neural network, singular spectrum analysis model, and a hybrid model of the ARIMA (p, d, q) model and recurrent neural network) when forecasting stock market returns in South Africa, as measured by the RMSE. | The RMSE of the ARIMA (3, 0, 1) model was 1.062336, the RMSE of the self-exciting threshold autoregressive model was 1.064449, the RMSE of the recurrent neural network was 1.063028, the RMSE of the singular spectrum analysis model was 1.066819, and the RMSE of the hybrid model was 1.061791.

Table 2.4 shows that most research used whole feature sets, compared few models, and barely examined how different feature subsets and reduced dimensionality change model performance, a limitation this research addresses through the number of experiments it considers. Previous research established that machine learning models can be used to forecast the reaction of stock market returns to changing economic activities in the emerging economic context. This research advances it by comparing various models with different feature subsets and data dimensions.

2.5 Research Gaps

While prior research has examined how economic variables impact stock market returns in South Africa, only a few studies have looked into the performance differences between statistical models and machine learning models. Those that did so concentrated on advanced economies, employed restricted experimental scenarios, and made out-of-sample forecasts over a single time horizon. This study, on the other hand, focuses on comparing the effectiveness of statistical models and machine learning models in projecting the reaction of stock market returns to changing economic activity in a developing economy.
This study is distinctive in that it tested several models trained on distinct feature subsets and economic data components over varied time periods.

2.6 The use of Machine Learning Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities

The machine learning approach denotes a practical, systematic approach that maps out logical steps for sophisticated computer systems to learn tasks T based on experience [Liu 1996; Mitchell 1997; Kennedy et al. 2020]. This approach improves on task T using knowledge of prior performance P estimates, identifies complex data structures, and enhances the generalization capacity [Dietterich 1996], which is useful for forecasting the reaction of stock market returns to changing economic activities.

This research used the ordinary least-squares regression model, decision tree model, random forest tree model, extreme gradient boosting tree model, recurrent neural network, gated recurrent unit network, long-short term memory, restricted Boltzmann machine, and multi-layer perceptron.

2.6.1 The Ordinary Least-Squares Regression Model

The ordinary least-squares regression model learns $x_i$ (predictor features) and predicts $y_i$ (a target feature) while minimizing the sum of squared $\hat{\epsilon}_i$ (ordinary least-squares regression model residuals). Equation 2.24 defines the ordinary least-squares regression model:

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\epsilon}_i, \tag{2.24}$$

where $\hat{y}_i$ denotes predicted $y_i$ estimates, $\hat{\beta}_0$ (the y-intercept) denotes the value of $\hat{y}_i$ where $x_i = 0$, and $\hat{\beta}_1$ (a slope coefficient) denotes the direction and size of corresponding changes between $x_i$ and $\hat{y}_i$. Equation 2.25 defines $\hat{\beta}_1$:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}, \tag{2.25}$$

where $\bar{x}$ denotes the mean value of $x_i$ and $\bar{y}$ denotes the mean value of $y_i$. Equation 2.26 defines $\hat{\epsilon}_i$:

$$\hat{\epsilon}_i = y_i - \hat{y}_i. \tag{2.26}$$

Equation 2.24 with more $x_i$ results in Equation 2.27:

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3 + \dots + \hat{\epsilon}_i. \tag{2.27}$$

Marozva [2020] used the ordinary least-squares regression model to forecast the reaction of stock market return volatility to changing economic activities and political activities in South Africa. Similarly, Mpofu [2011] used the model to forecast the reaction of stock market returns to the manufacturing index and prime overdraft rate, among other features.

2.6.2 The Decision Tree Model

The decision tree model adopts a recursive partitioning strategy to isolate feature values. Successively, the model reduces the impurity using the Gini impurity estimator and splits nodes while reducing the entropy (Equation 2.29) [Moore II 1987]. Equation 2.28 defines the Gini impurity estimator:

$$\hat{f}(x_i) = 1 - \sum_{j=1}^{c} \hat{p}_j^{2}, \tag{2.28}$$

where $\hat{p}_j$ denotes the proportion of samples in the node that belong to class $j$, and $c$ denotes the number of classes.

By dividing feature values into manageable chunks, the entropy estimator isolates homogeneous feature values of nodes and then identifies irregularities. Equation 2.29 defines the entropy estimator:

$$\hat{f}(x_i) = -\sum_{j=1}^{c} \hat{p}_j \log_2 \hat{p}_j. \tag{2.29}$$

Equation 2.29 requires $\hat{p}_j \neq 0$, so empty classes are excluded from the sum. The entropy equals 0 when all feature values in the node belong to the same class.

Nengovhela [2022] used decision tree models to forecast the reaction of stock market returns to changing economic activities in South Africa.

2.6.3 The Random Forest Tree Model

The random forest tree model combines decision tree models produced through a random process to enhance model performance by using the loss minimization approach at multiple iterations [Vijayakumar and Cheung 2018]. Equation 2.30 defines the random forest tree model:

$$\hat{y}_i = \frac{1}{N} \sum_{n=1}^{N} \hat{f}_n(x'_i), \tag{2.30}$$

where $\hat{f}_n(x'_i)$ denotes the function (a linear function for this research) and $N$ denotes the number of decision tree models.

Nengovhela [2022] used random forest tree models to forecast the reaction of stock market returns to changing economic activities in South Africa.
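The impurity measures in subsection 2.6.2 and the averaging step in Equation 2.30 can be sketched in a few lines. The example below uses the standard Gini impurity and Shannon entropy definitions over class proportions, and a plain mean over per-tree predictions. The function names, the class labels, and the per-tree predictions are hypothetical; a full decision tree or random forest implementation involves split search and bootstrapping that are not shown.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Proportion of samples in a node that fall in each observed class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_impurity(labels):
    """One minus the sum of squared class proportions."""
    return 1.0 - sum(p ** 2 for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits; only observed (non-empty) classes contribute."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def forest_prediction(tree_predictions):
    """Average of the individual decision tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical node contents and per-tree predictions.
mixed_node = ["up", "up", "down", "down"]
pure_node = ["up", "up", "up"]
print(gini_impurity(mixed_node), entropy(mixed_node))
print(gini_impurity(pure_node), entropy(pure_node))
print(forest_prediction([0.01, 0.03, 0.02]))
```

A node split evenly between two classes has the maximum two-class impurity, while a pure node scores zero on both measures, which is why splits that lower these values are preferred.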
2.6.4 The Extreme Gradient Boosting Tree Model

The extreme gradient boosting tree model bundles shallow decision tree models to enhance model performance by minimizing the loss over multiple iterations [Hastie et al. 2009; Nokeri 2021]. This involves evaluating the performance of decision tree models and correcting previous subpar model performance by fitting each new decision tree model to the residuals from previous iterations [Mason et al. 1999]. Equation 2.31 learns $x_i$:

\[ \hat{y}_i = \hat{m}_i(x_i) + \hat{\epsilon}_1. \tag{2.31} \]

Equation 2.32 investigates the $\hat{\epsilon}_i$ dependency by regressing each residual on $x_i$:

\[ \hat{\epsilon}_1 = \hat{g}_i(x_i) + \hat{\epsilon}_2, \qquad \hat{\epsilon}_2 = \hat{h}_i(x_i) + \hat{\epsilon}_3. \tag{2.32} \]

Equation 2.33 bundles the regressed $\hat{\epsilon}_i$ in Equation 2.32:

\[ \hat{y}_i = \hat{m}_i(x_i) + \hat{g}_i(x_i) + \hat{h}_i(x_i) + \hat{\epsilon}_3. \tag{2.33} \]

Equation 2.34 weights each component to complete $\hat{y}_i$:

\[ \hat{y}_i = \hat{\alpha} \times \hat{m}_i(x_i) + \hat{\beta} \times \hat{g}_i(x_i) + \hat{\gamma} \times \hat{h}_i(x_i) + \hat{\epsilon}_4. \tag{2.34} \]

Alhomadi [2021] used the extreme gradient boosting tree model to forecast the reaction of stock market returns to changing economic activities in the U.S. No prior research was found that used the model to forecast the reaction of stock market returns in South Africa.

2.7 The use of Neural Networks in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities

Neural networks are a distinct class of machine learning models loosely modeled on animals' biological neural networks. A neural network accumulates $x_i$ (predictor features) in the input layer (the first layer in a neural network), then uses $\hat{f}(x_i)$ (an activation function) to identify complex data structures and route feature values to successive hidden layers (layers between the input layer and the output layer in a neural network), assigns distinct $w_i$ (weights) and $\beta_i$ (biases), then uses $\hat{f}(x_i)$ in the output layer (the last layer in a neural network) to forecast subsequent $\hat{y}_i$ (predicted target feature estimates).
This research considers a subset of machine learning models known as neural networks, i.e., the recurrent neural network (subsection 2.7.1), gated recurrent unit (subsection 2.7.2), long-short term memory (subsection 2.7.3), restricted Boltzmann machine (subsection 2.7.4), and multi-layer perceptron (subsection 2.7.5).

2.7.1 The Recurrent Neural Network

The recurrent neural network accumulates $x_t$, then uses $\hat{f}(x_t)$ to combine $x_t$ and $\hat{h}_{t-1}$ (previous hidden states), assigns distinct $w_t$, and forecasts $y_t$ (target feature values) and successive $\hat{h}_t$ using the hyperbolic tangent activation function. Equation 2.35 defines the recurrent neural network:

\[ \hat{h}_t = \tanh(w_h \times h_{t-1} + w_x \times x_t), \tag{2.35} \]

where $\hat{h}_t$ denotes hidden states, $w_h$ denotes weights of $h_{t-1}$, $w_x$ denotes weights of $x_t$, and $\tanh$ denotes a hyperbolic tangent activation function that restricts hidden states to [-1, 1]. Equation 2.36 predicts $y_t$:

\[ \hat{y}_t = w_{hy} \times \hat{h}_t. \tag{2.36} \]

Sako et al. [2022] used the recurrent neural network to forecast the reaction of stock market returns to the exchange rate (ZAR/USD) in South Africa.

Mallikarjuna and Rao [2019] compared the performance of different models (i.e., the ARIMA (p, d, q) model, self-exciting threshold autoregressive model, recurrent neural network, singular spectrum analysis model, and a hybrid model of the ARIMA (p, d, q) model and recurrent neural network) when forecasting stock market returns in South Africa, among other countries, as measured by the root mean squared error. The research found that all other candidate models outperform the recurrent neural network.

2.7.2 The Gated Recurrent Unit Network

The gated recurrent unit uses $\hat{f}(x_t)$ to learn $x_t$ and $h_{t-1}$, and contains an update gate and a reset gate that control how much of $x_t$ and $h_{t-1}$ is retained and how much uninformative content is discarded [Chung et al. 2014]. With the hidden state initialized to 0 at $t = 0$:

\[ z_t = \sigma\left(w_z[h_{t-1}, x_t]\right), \tag{2.37} \]

\[ r_t = \sigma\left(w_r[h_{t-1}, x_t]\right), \tag{2.38} \]

\[ \tilde{h}_t = \tanh\left(w_h[r_t \times h_{t-1}, x_t]\right), \tag{2.39} \]

\[ \hat{h}_t = (1 - z_t) \times h_{t-1} + z_t \times \tilde{h}_t, \tag{2.40} \]

where $z_t$ denotes an update gate, $r_t$ denotes a reset gate, $\tilde{h}_t$ denotes candidate hidden states, $\hat{h}_t$ denotes hidden states, $x_t$ denotes predictor features, $\sigma$ denotes a sigmoid activation function, $\tanh$ denotes a hyperbolic tangent activation function, and $w_z$, $w_r$, and $w_h$ denote weight matrices.

Sako et al. [2022] compared the performance of various neural networks (i.e., the gated recurrent unit, among other neural networks like the recurrent neural network and long-short term memory) when forecasting the reaction of stock market returns to the exchange rate of the South African rand to the U.S. dollar.

2.7.3 The Long-Short Term Memory

The long-short term memory accumulates $x_t$ and uses gates (a forget gate, an input or update gate, and an output gate) to decide which parts of $x_t$ and $h_{t-1}$ to retain, before routing the retained information to the cell state vectors that are used to forecast $y_t$ [Hochreiter and Schmidhuber 1997]:

\[ f_t = \sigma(w_f[h_{t-1}, x_t] + \hat{\beta}_f), \tag{2.41} \]

\[ i_t = \sigma(w_i[h_{t-1}, x_t] + \hat{\beta}_i), \tag{2.42} \]

\[ \tilde{c}_t = \tanh(w_c[h_{t-1}, x_t] + \hat{\beta}_c), \tag{2.43} \]

\[ c_t = f_t \times c_{t-1} + i_t \times \tilde{c}_t, \tag{2.44} \]

\[ o_t = \sigma(w_o[h_{t-1}, x_t] + \hat{\beta}_o), \tag{2.45} \]

\[ \hat{h}_t = o_t \times \tanh(c_t), \tag{2.46} \]

where $x_t$ denotes predictor features, $f_t$ denotes the activation vector of the forget gate, $i_t$ denotes the activation vector of the input gate (or update gate), $o_t$ denotes the activation vector of the output gate, $\tilde{c}_t$ denotes candidate cell state vectors, $c_t$ denotes cell state vectors, $\hat{h}_t$ denotes predicted hidden states, $w_f$, $w_i$, $w_c$, and $w_o$ denote weight matrices, and $\hat{\beta}_c$ denotes the bias of the candidate cell state vectors.

Balusik et al. [2021] compared the performance of the long-short term memory to the performance of the SARIMA $(p, d, q) \times (P, D, Q, s)$ model when forecasting stock market returns in South Africa, as measured by the RMSE metric. The research found the long-short term memory outperforms the SARIMA $(p, d, q) \times (P, D, Q, s)$ model.
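A single long-short term memory step can be sketched as follows. This is a minimal illustration that assumes the standard formulation of Hochreiter and Schmidhuber [1997] with scalar states; the `lstm_step` function, the gate names, and the parameter values are hypothetical and do not come from the research code.

```python
import math


def sigmoid(value):
    """Logistic function, used by the forget, input, and output gates."""
    return 1.0 / (1.0 + math.exp(-value))


def lstm_step(x_t, h_prev, c_prev, params):
    """Run one LSTM step for a scalar input, hidden state, and cell state.

    params maps each gate name to a tuple (weight_h, weight_x, bias).
    """
    def pre_activation(gate):
        weight_h, weight_x, bias = params[gate]
        return weight_h * h_prev + weight_x * x_t + bias

    f_t = sigmoid(pre_activation("forget"))           # what to keep of c_prev
    i_t = sigmoid(pre_activation("input"))            # how much new content to add
    c_tilde = math.tanh(pre_activation("candidate"))  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde                # updated cell state
    o_t = sigmoid(pre_activation("output"))           # what to expose
    h_t = o_t * math.tanh(c_t)                        # new hidden state
    return h_t, c_t


# Hypothetical parameters shared by all gates, and a short series of returns.
gates = ("forget", "input", "candidate", "output")
params = {gate: (0.5, 0.5, 0.0) for gate in gates}
zero_params = {gate: (0.0, 0.0, 0.0) for gate in gates}

h, c = 0.0, 0.0
for x in [0.01, -0.02, 0.015]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```

Because the hidden state is the product of a sigmoid gate and a hyperbolic tangent, it always lies strictly between -1 and 1, whereas the cell state is not bounded in that way.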
2.7.4 The Restricted Boltzmann Machine

The restricted Boltzmann machine denotes a neural network with a visible layer $\hat{v}_i$ and a hidden layer $\hat{h}_j$, connected by weights $w_{i,j}$. Equation 2.47 defines the energy function of $\hat{v}_i$ and $\hat{h}_j$:

\[ E(v, h) = -\sum_{i=1}^{n} \alpha_i v_i - \sum_{j=1}^{n} \hat{\beta}_j h_j - \sum_{i=1}^{n} \sum_{j=1}^{n} \hat{v}_i w_{i,j} \hat{h}_j, \tag{2.47} \]

where $\alpha_i$ denotes biases of $\hat{v}_i$, $\hat{\beta}_j$ denotes biases of $\hat{h}_j$, and $w_{i,j}$ denotes the weights between them. Equation 2.48 restates Equation 2.47 in matrix form:

\[ E\left(\hat{v}, \hat{h}\right) = -\alpha^{T} \hat{v} - \hat{\beta}^{T} \hat{h} - \hat{v}^{T} w \hat{h}. \tag{2.48} \]

da Costa and Gebbie [2020] used the restricted Boltzmann machine stacked with auto-encoders to forecast the stock market price in South Africa.

2.7.5 The Multi-Layer Perceptron

The multi-layer perceptron maintains an input layer and an output layer, with at most two hidden layers between them. This feed-forward network uses the back-propagation learning approach to estimate the gradient of the loss function with respect to its weights. Equation 2.49 accumulates and converts $x_i$ and assigns distinct $w_{ij}$ and $\hat{\beta}_j$, then routes them to an initial hidden layer:

\[ \hat{f}(x_i) = \hat{\beta}_j + \sum_{i=1}^{n} w_{ij} x_i. \tag{2.49} \]

Equation 2.50 uses $\varphi_m$ (a sigmoid activation function) in an initial hidden layer, which incrementally routes feature values to successive hidden layers:

\[ \varphi_m = \left[1 + \exp\left(-\hat{f}(x_i)\right)\right]^{-1}. \tag{2.50} \]

Equation 2.51 combines the activations to forecast $\hat{y}_i$, which successive hidden layers treat as $x_i$:

\[ \hat{y}_i = \hat{\beta}_j + \sum_{j=1}^{n} w_j \varphi_j. \tag{2.51} \]

Ataman and Kahraman [2021] compared the performance of the ordinary least-squares model, multi-layer perceptron, and a hybrid model (integrating the ordinary least-squares model and multi-layer perceptron) when forecasting the reaction of stock market returns to changing economic activities in Brazil, Russia, India, China, and South Africa, as measured by the R-Squared metric. The research found the hybrid model outperforms the ordinary least-squares model and multi-layer perceptron.
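The forward pass of the multi-layer perceptron in subsection 2.7.5 can be sketched as below. The sketch assumes one hidden layer with a sigmoid activation and a linear output; the helper names (`dense`, `mlp_forward`) and all weights are hypothetical, and the back-propagation step is deliberately omitted.

```python
import math


def sigmoid(value):
    """Logistic activation: 1 / (1 + exp(-value))."""
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs, weights, biases):
    """Weighted sum per unit; weights[j][i] links input i to unit j."""
    return [
        bias + sum(w * x for w, x in zip(row, inputs))
        for row, bias in zip(weights, biases)
    ]


def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: weighted sum, sigmoid hidden layer, linear output."""
    hidden = [sigmoid(z) for z in dense(inputs, hidden_weights, hidden_biases)]
    return output_bias + sum(w * phi for w, phi in zip(output_weights, hidden))


# Hypothetical network: three standardized predictor values, two hidden units.
features = [0.2, -1.0, 0.5]
prediction = mlp_forward(
    features,
    hidden_weights=[[0.1, -0.3, 0.2], [0.4, 0.0, -0.1]],
    hidden_biases=[0.0, 0.1],
    output_weights=[0.7, -0.2],
    output_bias=0.05,
)
print(prediction)
```

Training would adjust the weights and biases by back-propagating the gradient of a loss function; that step is outside the scope of this sketch.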
2.8 The Activation Function

To forecast $y_i$ (target feature values), $\hat{f}(x_i)$ (an activation function) first accumulates $x_i$ and identifies complex data structures, then predicts subsequent $y_i$. There are multiple activation functions (e.g., the sigmoid activation function, hyperbolic tangent activation function, and rectified linear unit activation function, among other activation functions).

2.8.1 The Rectified Linear Unit Activation Function

This research uses a rectified linear unit (relu) activation function at layers of neural networks, because it does not bound $\hat{y}_i$ from above: it produces $\hat{y}_i$ that range from 0 to $\infty$ (infinity) after accumulating $x_i$ and identifying their predictive patterns. Equation 2.52 defines the relu activation function:

\[ \mathrm{Relu}(x_i) = \max(0, x_i). \tag{2.52} \]

The relu activation function acquires feature values (and/or hidden states) from the input layer, then routes them to subsequent hidden layers, which learn predictive patterns and attach varying biases and weights.

In the last hidden layer, the relu activation function directs predictor feature values (and/or hidden states) to the output layer, which predicts concealed target feature values (and/or subsequent hidden states).

3 Research Methods

This research adopts a research method from previous research. Figure 3.1 shows the modeling pipeline exhibiting the workflow that produces models that forecast the reaction of stock market returns to changing economic activities.

FIGURE 3.1: The modeling pipeline exhibiting the workflow that produces models that forecast the reaction of stock market returns to changing economic activities

3.1 The Acquisition of Economic and Stock Market Data of South Africa

This research extracts 54 monthly economic features from the South African Reserve Bank (SARB) database, and 99 monthly economic features from the Federal Reserve Economic Data (FRED) database, because those were the monthly features available in the specified databases.
It extracts the monthly FTSE/JSE all-share index price feature from the JSE database.

The sample period was selected because the Financial Times Stock Exchange/Johannesburg Stock Exchange (FTSE/JSE) all-share index was established in 2002 by the FTSE and JSE. Coverage of most of the economic features considered in this research also begins around 2002. The sample period therefore begins in 2002.

3.1.1 Sources of Economic and Stock Market Data of South Africa

Table 3.1 shows sources of economic and stock market data of South Africa, along with the description of data and the sample period.

TABLE 3.1: Sources of economic and stock market data of South Africa, along with the description of data and the sample period

| Source | Description | Features | Sample Period |
| FRED | The Federal Reserve Bank of St. Louis is a regional reserve bank that is part of the United States Central Bank, which has its headquarters in Washington, D.C.¹ | 99 monthly changing economic features (predictor features) sourced from the FRED database. | 2002 - 2022 |
| SARB | SARB is the Central Bank of South Africa.² | 54 monthly changing economic features (predictor features) sourced from the SARB database. | 2002 - 2022 |
| JSE | JSE is the official stock exchange in South Africa. | A single monthly temporal feature, the FTSE/JSE all-share index price (the target feature) sourced from the SARB.³ | 2002 - 2022 |

The FRED database combines data from many sources, including Statistics South Africa (Stats SA), the Organization for Economic Co-operation and Development (OECD), and the South African Reserve Bank (SARB). Some variables are unavailable in the SARB database. As a result, the FRED database was utilized as a supplementary data source to collect South African economic information.
¹ The Federal Reserve Bank of St. Louis periodically maintains economic data of many nations, along with changing economic activities, in a database known as FRED.
² The principal SARB mandate is to structure and influence South African economic expansionary policies, and produce banknotes and coins, among other mandates.
³ The FTSE/JSE all-share index represents the benchmark performance of equities in South Africa for 99.9% of the total market capitalization.

3.1.2 Properties of Economic and Stock Market Data of South Africa

Table 3.2 shows properties of economic and stock market data of South Africa, and commentary on how those properties affect models that forecast the reaction of stock market returns to changing economic activities, along with strategies for addressing non-adherence to model requirements. The purpose of identifying the properties was to identify which research techniques to use.

TABLE 3.2: The properties of economic and stock market data of South Africa, and commentary on how those properties affect models that forecast the reaction of stock market returns to changing economic activities, along with strategies for addressing non-adherence to model requirements

| Data Properties | Ratings (columns: Vector Autoregressive, Linear Regression, Tree-based, Neural Network) | Comments |
| Non-stationarity | *** *** | Certain economic features are not stationary. To adhere to the stationarity requirement of the vector autoregressive model, this research used an order differencing strategy to produce stationary economic features and prevent the model residual expansion problem. |
| Few economic feature values | *** *** * *** | Economic features contain few feature values, which makes it challenging to identify complex predictive patterns using neural networks, as neural networks are intended for large data sets. |
| Missing economic feature values | * | While neural networks and the decision tree model do not hold rigid model requirements regarding missing values, the vector autoregressive model and ordinary least-squares regression model hold rigid requirements. The k nearest neighbor imputation strategy imputes missing economic feature values. |
| High dimensions of economic activities | * *** ** *** | The vector autoregressive and ordinary least-squares regression models cannot identify complex structures in economic and stock market data and generalize to unknown observations when there are high dimensions of economic activities. The principal components analysis method reduces the dimensions of economic activities of South Africa into meaningful eigenvectors. |
| Different scales and quantities of economic features | *** *** *** *** | Economic features come in different scales and quantities. The data standardization scaling strategy scales changing economic activities before modeling. |

Legend
* Poor data modeling capacity.
** Regular data modeling capacity.
*** Optimal data modeling capacity.

3.1.3 Strategies used to Cleanse and Preprocess Data Before Forecasting

Table 3.3 shows strategies used to cleanse and preprocess data before forecasting the reaction of stock market returns to changing economic activities in South Africa.

TABLE 3.3: Strategies used to cleanse and preprocess data before forecasting the reaction of stock market returns to changing economic activities in South Africa

| Strategy | Description |
| Redundant economic features removal | There were duplicate economic features present in original data sets extracted from the Federal Reserve Economic Data (FRED) database and South African Reserve Bank (SARB) database. Duplicate features were removed before combining data sets to avoid redundancy in analysis. |
| Data labeling | Economic features extracted from the FRED database had economic feature codes as column names. Columns were renamed to their official economic feature names. |
| Data concatenation | Economic features came from different data sources (i.e., the FRED database, SARB database, and JSE database). Economic features were concatenated to complete the data set. |
| Indexing | The index (date) of economic features extracted from the FRED database and of stock market returns had a format that was incompatible with platform and model requirements. The format was set to a Year-Month-Day format. |
| Sample period specification | Economic features had varying sample periods. Sample periods were set from 2002 to 2022, because the FTSE/JSE all-share index was established in 2002 by the FTSE and JSE. |
| Separator conversion | Economic features extracted from the SARB database contained comma separators. Comma separators were converted to decimal point separators, so features adhere to platform and model requirements. |
| Infinity conversion | Certain economic features contained negative and positive infinity values of a float type. Infinity values were converted to NaN (missing feature values), then replaced using the k nearest neighbor data imputation strategy. |
| Feature stationarity by order differencing | Features were made stationary using the order differencing strategy. |
| Data imputation | Economic features contained missing feature values. The k nearest neighbor data imputation strategy replaced missing feature values with values in proximity to them. |
| Outlier replacement | Economic features had varying sample periods and contained outliers. The median outlier replacement strategy replaced outliers with the median value. |
| Data partitioning | The random data partitioning strategy ensured models learn economic activities and forecast the reaction of stock market returns in South Africa. This strategy was adopted with varying split ratios to compare the performance of models over four periods. |
| Data dimension reduction | The principal components analysis method reduced dimensions of economic activities in South Africa into meaningful eigenvectors. The principal component analysis method produced an index that offers insight into the structure of the economy in South Africa based on the explained variance ratio. |
| Data standardization | Economic features came in different scales and quantities. The data standardization scaling strategy places economic activities on a standard scale. |

3.2 The Strategy used to Impute Missing Economic Data of South Africa

Most machine learning models cannot handle missing values [Rubin 1975], hence Leke and Marwala [2019] and Thulare et al. [2021] suggested the use of data imputation strategies. This research used the k nearest neighbor imputation strategy to replace missing values (subsection 3.2.1).

3.2.1 The k Nearest Neighbor Imputation Strategy

Previous research suggested the use of the k nearest neighbor imputation strategy to impute missing values of economic features with values in proximity using the Euclidean distance ($d$) method [Zhang 2012; Mulaudzi and Ajoodha 2020]. Equation 3.1 defines the Euclidean $d$ method:

\[ d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}, \tag{3.1} \]

where $d$ denotes the distance between feature values, $n$ denotes the number of $x_i$, whereas $x_i$ and $y_i$ denote feature values in a Euclidean space.

3.3 The Strategy used to Replace Outliers in Economic and Stock Market Returns Data of South Africa

Outliers pose a problem in analysis, because some models are sensitive to outliers. The research uses the median outlier replacement strategy to replace outliers with the median value (subsection 3.3.1).

3.3.1 The Median Outlier Replacement Strategy

The median outlier replacement strategy uses the median value to replace outliers (i.e., $x_i$ below the 5th percentile or $x_i$ above the 95th percentile) in the sample probability distribution.
Equation 3.2 defines the median ($M$), where $n$ is odd:

\[ M = x_{\left(\frac{n + 1}{2}\right)}, \tag{3.2} \]

where $n$ denotes the sample size and $x_{(k)}$ denotes the $k$-th smallest feature value. Equation 3.3 defines $M$, where $n$ is even:

\[ M = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2} + 1\right)}}{2}. \tag{3.3} \]

3.4 The Strategy used to Reduce Economic Data Dimensions of South Africa

The curse of data dimensionality problem frequently leads to an upsurge in the amount of power required for computation [Young 2020]. This research used the principal components analysis method to reduce dimensions of economic activities in South Africa (subsection 3.4.1).

3.4.1 The Principal Components Analysis Method

The principal component analysis method, an unsupervised machine learning method, reduces feature sets into meaningful eigenvectors. This method is suitable for addressing the curse of data dimensionality problem [Shen 2009]. Equation 3.4 defines the covariance matrix from which the principal component analysis method derives eigenvectors:

\[ \frac{1}{m} \sum_{i=1}^{m} (x_i - \bar{x}) (x_i - \bar{x})^{T}, \tag{3.4} \]

where $\bar{x}$ denotes the mean value of $x_i$ and $m$ denotes the number of observations.

The research ranks economic activities based on the explained variance ratio ($\sigma^2_i$) value to produce an index that offers insight into the structure of the economy in South Africa. Equation 3.5 defines $\sigma^2_i$:

\[ \sigma^2_i = \frac{\hat{v}(\tilde{x}_i)}{\sum_{j=1}^{n} \hat{v}(x_j)}, \tag{3.5} \]

where $\hat{v}$ denotes the variance, $\tilde{x}_i$ denotes the $i$-th principal component, and $x_j$ denotes the original features. A scree plot identifies the number of components needed to complete the principal component analysis method. The point at which the eigenvalues level off defines the criterion for selecting the number of components.

3.5 The Strategy used to Partition Economic and Stock Market Returns Data of South Africa

This research used the random data partitioning strategy to partition data, so each feature value has an equal chance of being included in the selection (subsection 3.5.1). It considers four split ratios to identify the performance of machine learning models over four periods.
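Random partitioning under several split ratios can be sketched as follows. This is a minimal illustration only: the `random_partition` function, the fixed seed, the 240 synthetic observations, and the four test ratios are assumptions made for the example, since the ratios used in the research are not restated here.

```python
import random


def random_partition(rows, test_ratio, seed=0):
    """Randomly assign rows to a training set and a test set.

    Every row has the same chance of landing in the test set.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    test_size = round(len(rows) * test_ratio)
    test_indices = sorted(indices[:test_size])
    train_indices = sorted(indices[test_size:])
    train = [rows[i] for i in train_indices]
    test = [rows[i] for i in test_indices]
    return train, test


# Hypothetical data: 240 monthly observations, identified by position.
observations = list(range(240))

# Hypothetical split ratios; each yields a different training/test balance.
for ratio in (0.1, 0.2, 0.3, 0.4):
    train, test = random_partition(observations, ratio)
    print(ratio, len(train), len(test))
```

Fixing the seed keeps the partition reproducible, so that models compared on the same split ratio see the same training and test observations.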
3.5.1 The Random Data Partitioning Strategy

The random data partitioning strategy assigns feature values to either the training or test data set through a probabilistic process. Estimating the population proportion establishes the confidence interval. Equation 3.6 defines the population proportion:

\[ \hat{p}_i \pm z \sqrt{\frac{\hat{p}_i (1 - \hat{p}_i)}{n}}, \tag{3.6} \]

where $\hat{p}_i$ denotes sample proportion estimates, $n$ denotes the sample size, and $z$ denotes the critical value. Equation 3.7 defines the mean value of the sample proportion:

\[ \hat{\mu}_i = \hat{p}_i. \tag{3.7} \]

Equation 3.8 defines $\sigma(\hat{p}_i)$:

\[ \sigma(\hat{p}_i) = \sqrt{\frac{\hat{p}_i (1 - \hat{p}_i)}{n}}. \tag{3.8} \]

Equation 3.9 defines the confidence interval (CI) of the sample proportion:

\[ CI = \left( \hat{p}_i - z \, \sigma(\hat{p}_i), \; \hat{p}_i + z \, \sigma(\hat{p}_i) \right), \tag{3.9} \]

where the bounds of the CI lie between 0 and 1, and $\pm z$ denotes critical values.

3.6 The Strategy used to Scale Economic Data of South Africa

Machine learning models require features to be on a common scale before modeling, so objective functions perform calculations efficiently. This research used the data standardization scaling strategy to place feature values in the training data set on a standard scale for apt model comparison (subsection 3.6.1).

3.6.1 The Data Standardization Scaling Strategy

The data standardization scaling strategy rescales each feature so that $\mu_i = 0$ and $\sigma_i = 1$, with $x_i$ centered around 0, which places features on the scale of a standard normal distribution [Tabak 2004]. Equation 3.10 defines the data standardization scaling strategy:

\[ \mathrm{Scale}_{\mathrm{standard}} = \frac{x_i - \bar{x}}{\sigma(x_i)}, \tag{3.10} \]

where $x_i$ denotes feature values, $\bar{x}$ denotes the mean value of $x_i$, and $\sigma(x_i)$ denotes the divergence of $x_i$ from $\bar{x}$. Equation 3.11 defines $\sigma(x_i)$:

\[ \sigma(x_i) = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}. \tag{3.11} \]

3.7 Experiments for Model Comparison

Table 3.4 shows various experiments considered for model comparison.
TABLE 3.4: Experiments considered when determining whether machine learning models perform better than the benchmark model when forecasting the reaction of stock market returns to changing economic activities in South Africa

| Scenario | Description | Marks (columns: Principal Component Analysis, Regression, Tree-based, Neural Network, Vector Autoregressive) |
| Scenario 1 | Investigate the dynamics of the economy and how stock market returns react to changing economic activities in South Africa using the default vector autoregressive model, along with default supervised machine learning models (i.e., the ordinary least-squares regression model, ridge model, least absolute shrinkage and selection operator model, elastic net model, decision tree model, random forest tree model, extreme gradient boosting tree model, recurrent neural network, gated recurrent unit, long-short term memory, restricted Boltzmann machine, and multi-layer perceptron). Following that, compare the performance of default supervised machine learning models against the performance of the benchmark model (the default vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. | * * * * |
| Scenario 2 | Present feature subsets containing key economic features selected based on the gini impurity value calculated by tree-based models. Concurrently, compare the performance of models trained on a feature subset to the performance of models trained on whole features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. | * * * |
| Scenario 3 | Produce low economic data dimensions found by the principal component analysis method. Subsequently, compare the performance of models trained on low economic data dimensions to the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. | * * * * |

3.8 Strategies used to Regularize Regression Models

Regularized regression models use an $\ell_1$ norm function or $\ell_2$ norm function (or both) to constrain model parameters by introducing a penalty term. The ridge model regularization strategy used the $\ell_2$ norm function, the least absolute shrinkage and selection operator model regularization strategy used the $\ell_1$ norm function, and the elastic net model regularization strategy used both the $\ell_1$ norm function and $\ell_2$ norm function.

3.8.1 The Ridge Model Regularization Strategy

The ridge model regularization strategy, also known as the $\ell_2$ norm model regularization strategy, balances the bias and variance (var) by applying a shrinkage penalty to the coefficients $\hat{\beta}_j$ while learning predictor feature values. Equation 3.12 defines the ridge model regularization strategy:

\[ \hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{n} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{n} \beta_j^2 \right\}, \tag{3.12} \]

where $\lambda$ denotes the penalty term, leading to Equation 3.13:

\[ \hat{\beta} = \left( x' x + \lambda I \right)^{-1} \left( x' y \right). \tag{3.13} \]

The penalty term penalizes large coefficients in a machine learning model, which limits the errors the model commits when learning predictor feature values and forecasting concealed target feature values. Equation 3.14 defines the bias:

\[ \mathrm{bias} = -\lambda \left( x' x + \lambda I \right)^{-1} \beta. \tag{3.14} \]

Equation 3.15 defines the var:

\[ \mathrm{var} = \sigma^2 \left( x' x + \lambda I \right)^{-1} x' x \left( x' x + \lambda I \right)^{-1}, \tag{3.15} \]

where $\sigma^2$ denotes the variance of model residuals.

3.8.2 The Least Absolute Shrinkage and Selection Operator Model Regularization Strategy

The least absolute shrinkage and selection operator model regularization strategy, also known as the $\ell_1$ norm model regularization strategy, enhances the performance of machine learning models by shrinking parameters of models, some of them to exactly zero [Tibshirani 1996]. Equation 3.16 defines the least absolute shrinkage and selection operator model regularization strategy:

\[ \hat{\beta} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{n} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{n} |\beta_j| \right\}. \tag{3.16} \]

3.8.3 The Elastic Net Model Regularization Strategy

The elastic net regularization strategy combines the penalties of the ridge model regularization strategy and least absolute shrinkage and selection operator model regularization strategy to manage bias and var, given a limited $n$ with higher dimensions [Zou and Hastie 2005a]. It achieves this by eliminating uninformative features and prioritizing noteworthy features, and contains a quadratic penalty term [Zou and Hastie 2005b; Meier et al. 2008]. Equation 3.17 defines the elastic net model regularization strategy:

\[ \hat{\beta} = \arg\min_{\beta} \left\{ \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x'_i \beta \right)^2 + \lambda \left( \frac{1 - \alpha}{2} \sum_{j=1}^{n} \beta_j^2 + \alpha \sum_{j=1}^{n} |\beta_j| \right) \right\}, \tag{3.17} \]

where $\alpha = 1$ for the least absolute shrinkage and selection operator model regularization strategy, and $\alpha = 0$ for the ridge model regularization strategy.

3.9 The Metric used to Evaluate the Performance of Models

This research identifies errors models commit when forecasting reactions of stock market returns to evaluate model performance. The mean absolute percentage error metric (subsection 3.9.1) evaluates the benchmark model (the vector autoregressive model), along with candidate machine learning models, when forecasting the reaction of stock market returns to changing economic activities in South Africa.
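The mean absolute percentage error named above can be computed with a few lines of code. The sketch below assumes the conventional definition, in which each absolute error is divided by the actual value; the `mape` function and the index prices are hypothetical and are not taken from the research data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent.

    Each absolute error is scaled by the corresponding actual value,
    so actual values must be non-zero.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(value == 0 for value in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical index prices and forecasts.
actual_prices = [100.0, 200.0]
forecast_prices = [110.0, 180.0]

# Each forecast is off by 10% of the actual value, in opposite directions.
print(mape(actual_prices, forecast_prices))
```

The two errors have opposite signs, yet the metric reports roughly 10% rather than 0%, because absolute values stop positive and negative errors from canceling. A lower value indicates a better forecast.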
3.9.1 The Mean Absolute Percentage Error Metric

Compared to other regression model performance metrics (e.g., the mean squared error metric, root mean squared error metric, and mean absolute error metric), the mean absolute percentage error (MAPE) metric, also called the mean absolute percentage deviation (MAPD) metric, is useful in evaluating the performance of regression-based models when the data set contains temporal features, since the metric is intuitive in model interpretation [Swamidass 2000].

The MAPE metric denotes the mean absolute divergence of $\hat{y}_i$ from $y_i$, expressed as a percentage of $y_i$. Because it takes absolute values, it ignores the sign of the divergence, which prevents positive errors and negative errors from canceling each other out, a problem with performance metrics based on signed errors [Myttenaere et al. 2016]. Equation 3.18 defines the MAPE metric:

\[ \mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|. \tag{3.18} \]

The MAPE metric serves as the primary model performance metric in this research. A model with a MAPE value in proximity to 0% is ideal.

Investors may use the MAPE metric to inform investing decisions. If the MAPE value is high, forecasts of the index are unreliable, and they may reconsider investing in the stock market in South Africa or exclude the index from their portfolio. Conversely, if the MAPE value is low, they can consider investing in the stock market in South Africa or include the index in their portfolio.

3.10 Ethical Consideration

This research extracts monthly economic features from FRED and SARB databases, and the monthly stock market price in South Africa from the JSE database. All databases are for general access and academic use. This research does not require an ethics clearance certificate, because all features come from secondary data sources.

3.11 Research Methods Summary

Table 3.5 shows links between research questions and research methods.
TABLE 3.5: Research methods summary

| Research Question | Research Method |
| How do stock market returns react to changing economic activities in South Africa? Do machine learning models perform better than the benchmark model (the vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric? | Investigate the dynamics of the economy and how stock market returns react to changing economic activities in South Africa using the default vector autoregressive model, along with default supervised machine learning models (i.e., the ordinary least-squares regression model, ridge model, least absolute shrinkage and selection operator model, elastic net model, decision tree model, random forest tree model, extreme gradient boosting tree model, recurrent neural network, gated recurrent unit, long-short term memory, restricted Boltzmann machine, and multi-layer perceptron). Following that, compare the performance of default supervised machine learning models against the performance of the benchmark model (the default vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. |
| Do models trained on a feature subset containing key economic features selected based on the gini impurity value calculated by tree-based models perform better than the models trained on whole features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric? | Present feature subsets containing key economic features selected based on the gini impurity value calculated by tree-based models. Concurrently, compare the performance of models trained on a feature subset to the performance of models trained on whole features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. |
| Is the performance of models trained on low economic data dimensions (found by the principal component analysis method) distinguishable from the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric? | Produce low economic data dimensions found by the principal component analysis method. Subsequently, compare the performance of models trained on low economic data dimensions to the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. |

4 Experiment Results & Discussions

This research investigated the dynamics of the economy and how stock market returns react to changing economic activities in South Africa. It considered various experiments, i.e., different feature subsets and data dimensions, to determine whether machine learning models perform better than the benchmark model (the vector autoregressive model) when forecasting the reaction of stock market returns.

Download the source code and data of the research project here: http://www.github.com/tshepochris

4.1 The Exploration of the Distribution of Stock Market Returns Data in South Africa

Figure 4.1 shows the distribution of stock market returns in South Africa over the sample period.

FIGURE 4.1: The distribution of stock market returns data in South Africa. Panels: (A) Series; (B) Series Distribution

Figure 4.1 (A) shows that stock market returns revert to steady levels, calming the market after extreme lows or highs, from 2002 to 2022.
Figure 4.1 (B) shows that stock market returns approximately follow a normal distribution.

Table 4.1 shows the p value of the augmented Dickey-Fuller (ADF) test, which tests the stationarity of stock market returns (at α = 0.05), where a p value < 0.05 denotes that the stock market returns in South Africa are stationary, and a p value > 0.05 denotes that the feature is non-stationary. It also shows the central tendency and dispersion of stock market returns in South Africa (i.e., the mean value, standard deviation value, skew value, and kurtosis value).

TABLE 4.1: The descriptive statistics of stock market returns data in South Africa

| | Mean | Std | Skew | Kurtosis | ADF p-value |
| Stock market returns in South Africa | 1.0105 | 0.0113 | 0.3357 | -0.4565 | 0.0055 |

Table 4.1 shows the mean stock market re