UNIVERSITY OF THE WITWATERSRAND

M.SC. COMPUTER SCIENCE BY DISSERTATION

Applying Machine Learning to Model South Africa’s Equity Market Index Price
Performance

Author:
Tshepo Chris NOKERI

Supervisors:
Dr. Ritesh AJOODHA and Mr. Rudzani MULAUDZI

A thesis submitted in fulfillment of the requirements
for the degree of M.Sc. Computer Science by Dissertation

in the

School of Computer Science and Applied Mathematics

July 14, 2023

https://www.wits.ac.za
http://www.tshepochris.com
https://www.riteshajoodha.co.za/
https://www.linkedin.com/in/rudzanimulaudzi
https://www.wits.ac.za/csam/



Declaration of Authorship
I, Tshepo Chris NOKERI, declare that this thesis titled, “Applying Machine Learning to
Model South Africa’s Equity Market Index Price Performance” and the work presented in
it, are my own. I confirm that:

• This work was done wholly or mainly while in candidature for the M.Sc. Computer
Science by Dissertation degree in the School of Computer Science and Applied
Mathematics at the University of the Witwatersrand.

• Where any part of this thesis has previously been submitted for a degree or any other
qualification at the University of the Witwatersrand or any other institution, this has
been clearly stated.

• Where I have consulted the published work of others, this is always clearly attributed.

• Where I have quoted from the work of others, the source is always given. Except for
such quotations, this thesis is entirely my own work.

• I have acknowledged all main sources of help.

• Where the thesis is based on work done by myself jointly with others, I have made clear
exactly what was done by others and what I have contributed myself.

Signature:

Date: July 14, 2023



UNIVERSITY OF THE WITWATERSRAND

Abstract

Faculty of Science

School of Computer Science and Applied Mathematics

M.Sc. Computer Science by Dissertation

Applying Machine Learning to Model South Africa’s Equity Market Index Price
Performance

by Tshepo Chris NOKERI

Policymakers typically use statistical multivariate forecasting models to forecast the reaction
of stock market returns to changing economic activities. However, these models frequently
perform poorly because they are inflexible and unable to model non-linear relationships.
Emerging research suggests that machine learning models can better handle data from
non-linear dynamic systems and yield stronger model performance. This research compared
the performance of machine learning models with that of the benchmark model (the vector
autoregressive model) when forecasting the reaction of stock market returns to changing
economic activities in South Africa. The vector autoregressive model achieved a mean
absolute percentage error (MAPE) value of 0.0084. Among the machine learning models, the
lowest MAPE value was 0.0051. The machine learning model trained on low economic data
dimensions performed 65% better than the benchmark model. Machine learning models also
identified key economic activities when forecasting the reaction of stock market returns.
Most prior research used the full feature set and compared only a few models, and it rarely
examined how different feature subsets and reduced dimensionality change model
performance, a limitation this research addresses through the number of experiments it
considers. This research considered various experiments, i.e., different feature subsets and
data dimensions, to determine whether machine learning models perform better than the
benchmark model when forecasting the reaction of stock market returns to changing
economic activities in South Africa.
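
The two MAPE values reported in the abstract can be compared in more than one way, so a short worked calculation helps show what a figure of about 65% can mean. The sketch below is illustrative only: the variable names are assumptions, and it shows one plausible reading of the comparison, not necessarily the calculation used in the thesis.

```python
# MAPE values quoted in the abstract.
benchmark_mape = 0.0084  # vector autoregressive (benchmark) model
best_ml_mape = 0.0051    # best machine learning model

# How much larger the benchmark error is, relative to the machine learning error.
benchmark_excess_pct = (benchmark_mape / best_ml_mape - 1) * 100

# How much smaller the machine learning error is, relative to the benchmark error.
error_reduction_pct = (benchmark_mape - best_ml_mape) / benchmark_mape * 100

print(f"Benchmark MAPE is {benchmark_excess_pct:.1f}% higher than the best ML MAPE")
print(f"Best ML MAPE is {error_reduction_pct:.1f}% lower than the benchmark MAPE")
```

Under this reading, the benchmark's error is roughly 65% larger than the error of the best machine learning model, which is the same as a reduction in MAPE of roughly 39% relative to the benchmark.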




Contents

Declaration of Authorship

Abstract

1 Introduction to the Research
1.1 Introduction to the Research
1.2 Problem Statement
1.3 Purpose Statement
1.4 Research Questions
1.5 Research Contributions
1.6 Research Motivation
1.7 Research Structure

2 Literature Review
2.1 The Stock Market in South Africa
2.1.1 The Financial Times Stock Exchange/Johannesburg Stock Exchange All-Share Index
2.2 The use of Asset Pricing Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
2.2.1 The Capital Asset Pricing Model
2.2.2 The Arbitrage Pricing Model
2.2.3 The Multi-Factor Model
    The Estimation of Alpha and Systematic Risk Factors
    The Estimation of Cumulative Systematic Risk Factors
2.2.4 The Modern Portfolio Model
2.2.5 The Drawbacks of Asset Pricing Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
2.2.6 The Rationale for Comparing the Performance of Statistical and Machine Learning Models
2.3 The use of Conventional Statistical Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
2.3.1 Statistical Univariate Forecasting Models
    The Autoregressive Integrated Moving Average Model
    The Seasonal Autoregressive Integrated Moving Average Model
2.3.2 The Statistical Multivariate Forecasting Model
    The Vector Autoregressive Model
2.3.3 The Drawbacks of Conventional Statistical Forecasting Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
2.4 State of Literature
2.4.1 Related Research
2.5 Research Gaps
2.6 The use of Machine Learning Models in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
2.6.1 The Ordinary Least-Squares Regression Model
2.6.2 The Decision Tree Model
2.6.3 The Random Forest Tree Model
2.6.4 The Extreme Gradient Boosting Tree Model
2.7 The use of Neural Networks in Forecasting the Reaction of Stock Market Returns to Changing Economic Activities
2.7.1 The Recurrent Neural Network
2.7.2 The Gated Recurrent Unit Network
2.7.3 The Long-Short Term Memory
2.7.4 The Restricted Boltzmann Machine
2.7.5 The Multi-Layer Perceptron
2.8 The Activation Function
2.8.1 The Rectified Linear Unit Activation Function

3 Research Methods
3.1 The Acquisition of Economic and Stock Market Data of South Africa
3.1.1 Sources of Economic and Stock Market Data of South Africa
3.1.2 Properties of Economic and Stock Market Data of South Africa
3.1.3 Strategies used to Cleanse and Preprocess Data Before Forecasting
3.2 The Strategy used to Impute Missing Economic Data of South Africa
3.2.1 The k Nearest Neighbor Imputation Strategy
3.3 The Strategy used to Replace Outliers in Economic and Stock Market Returns Data of South Africa
3.3.1 The Median Outlier Replacement Strategy
3.4 The Strategy used to Reduce Economic Data Dimensions of South Africa
3.4.1 The Principal Components Analysis Method
3.5 The Strategy used to Partition Economic and Stock Market Returns Data of South Africa
3.5.1 The Random Data Partitioning Strategy
3.6 The Strategy used to Scale Economic Data of South Africa
3.6.1 The Data Standardization Scaling Strategy
3.7 Experiments for Model Comparison
3.8 Strategies used to Regularize Regression Models
3.8.1 The Ridge Model Regularization Strategy
3.8.2 The Least Absolute Shrinkage and Selection Operator Model Regularization Strategy
3.8.3 The Elastic Net Model Regularization Strategy
3.9 The Metric used to Evaluate the Performance of Models
3.9.1 The Mean Absolute Percentage Error Metric
3.10 Ethical Consideration
3.11 Research Methods Summary

4 Experiment Results & Discussions
4.1 The Exploration of the Distribution of Stock Market Returns Data in South Africa
4.1.1 The Granger-Causal Relationship between Economic Activities and Stock Market Returns in South Africa
4.2 The Benchmark for Model Comparison
4.2.1 The Performance of the Vector Autoregressive Model when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.3 The Performance of Default Machine Learning Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.3.1 The Performance of Default Regression Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.3.2 The Performance of Default Tree-Based Models when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.3.3 The Performance of Default Neural Networks when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.4 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by Tree-based Models
4.4.1 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by the Decision Tree Model
4.4.2 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by the Random Forest Tree Model
4.4.3 The Selection of a Feature Subset Containing Key Economic Features based on the Gini Impurity Value Calculated by the Extreme Gradient Boosting Tree Model
4.4.4 The Performance of Models Trained on a Feature Subset Containing Key Economic Features when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.5 The Reduction of Economic Data Dimensions
4.5.1 Economic Features of South Africa in Different Dimensions
4.5.2 An Index that Offers Insight into the Structure of the Economy in South Africa
4.5.3 The Performance of Models Trained on Low Economic Data Dimensions when Forecasting the Reaction of Stock Market Returns to Changing Economic Activities in South Africa
4.6 Overall Performance Summary

5 Conclusions & Future Research
5.1 Introduction
5.2 Literature Review
5.3 Research Methods
5.4 Experimental Results
5.5 Policy Recommendations
5.6 Learned Model Application
5.7 Future Research

A A List of Economic Features used to Forecast the Reaction of Stock Market Returns in South Africa

B A Full Index that Offers Insight Into the Structure of the Economy in South Africa

C Coefficients of the Optimal Model

Bibliography



List of Figures

2.1 The reaction of stock market returns to changing economic activities, i.e., (A) economic activities, (B) international economic activities, (C) money and banking activities, (D) capital markets activities, and (E) national government finance activities

3.1 The modeling pipeline exhibiting the workflow that produces models that forecast the reaction of stock market returns to changing economic activities

4.1 The distribution of stock market returns data in South Africa
4.2 The loss function value across epochs of default neural networks when forecasting the reaction of stock market returns to changing economic activities in South Africa
4.3 The top ten economic features in a feature subset selected by the decision tree model, as measured by the Gini impurity value
4.4 The top ten economic features in a feature subset selected by the random forest tree model, as measured by the Gini impurity value
4.5 The top ten economic features in a feature subset selected by the extreme gradient boosting tree model, as measured by the Gini impurity value
4.6 Economic features of South Africa in two dimensions
4.7 Economic features of South Africa in three dimensions
4.8 The top performance of each machine learning model over four periods
4.9 The top performance of each tree-based feature selection strategy
4.10 The learning curve of the optimal model that forecasts the reaction of stock market returns to changing economic activities in South Africa



List of Tables

1.1 Research structure

2.1 A summary of used economic features and their categories
2.2 The drawbacks of asset pricing models when forecasting the reaction of stock market returns to changing economic activities
2.3 The drawbacks of conventional statistical forecasting models when forecasting the reaction of stock market returns to changing economic activities
2.4 Related research

3.1 Sources of economic and stock market data of South Africa, along with the description of data and the sample period
3.2 The properties of economic and stock market data of South Africa, and commentary on how those properties affect models that forecast the reaction of stock market returns to changing economic activities, along with strategies for addressing non-adherence to model requirements
3.3 Strategies used to cleanse and preprocess data before forecasting the reaction of stock market returns to changing economic activities in South Africa
3.4 Experiments considered when determining whether machine learning models perform better than the benchmark model when forecasting the reaction of stock market returns to changing economic activities in South Africa
3.5 Research methods summary

4.1 The descriptive statistics of stock market returns data in South Africa
4.2 The Granger-Causality relationship between selected economic activities and the stock market returns in South Africa
4.3 Selected economic features in the Granger-Causality Matrix
4.4 The performance of the vector autoregressive model when forecasting the reaction of stock market returns to changing economic activities in South Africa
4.5 The performance of default regression models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.6 The performance of default tree-based models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.7 The performance of default neural networks when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.8 The performance of models trained on a feature subset containing key economic features when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.9 An index that provides insight into economic activities in South Africa
4.10 The performance of models trained on low economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.11 The top ten highest-performing machine learning models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.12 The top ten worst-performing machine learning models when forecasting the reaction of stock market returns to changing economic activities in South Africa over four periods: 3 months (H1), 6 months (H2), 12 months (H3), and 24 months (H4)
4.13 The ranking of economic features based on the Gini impurity value calculated by each tree-based model

5.1 Research questions, various experiments, and research findings

A.1 A list of economic features used to forecast the reaction of stock market returns in South Africa

B.1 A full index that offers insight into the structure of the economy in South Africa

C.1 Coefficients of the optimal model (the ridge model)



1
Introduction to the Research

1.1 Introduction to the Research

The Johannesburg Stock Exchange (JSE) is widely regarded as the primary stock market in
Africa: in 2017 it was the 17th largest stock exchange in the world [SSEInitiative 2021;
JSELtd 2021] and the largest stock exchange in Africa, with a total market value of US$1.36
trillion and 339 listed stocks from various industries [SSEInitiative 2021]. Consequently,
forecasting the returns of stocks listed on the JSE is important, but doing so for every stock
remains challenging given the number of listed stocks.

Information about historical stock market index returns in South Africa is critical for
economic policymakers and investors, but it is insufficient on its own for decision-making,
because it says nothing about the future of the stock market index. Hence, policymakers
rely on predictive analytical models to model the data-generating process and produce
economic growth forecasts before developing and implementing policies that accomplish
strategic economic objectives.

The principal objective of investors is to avert financial loss while maximizing expected
asset returns. However, economic risk may prevent them from achieving this objective. To
attain it, investors focus heavily on mitigating unfavorable economic risk factors and
capitalizing on favorable ones. Policymakers also require substantial information to develop
policies that eliminate systematic market errors, to conduct monetary and fiscal
interventions for economic stability, and to promote equity market-friendly policies.

Previous research on the reaction of stock market returns to changing economic activities in
South Africa used conventional asset pricing models and statistical forecasting models that
suffer from methodological and practical limitations, which motivates the use of more
sophisticated analytical tools such as machine learning models. Furthermore, there is little
understanding of how machine learning models compare to traditional forecasting models in
forecasting the reaction of stock market returns in the modern economic setting.

This study examines the dynamics of the economy and how stock market returns respond to
changes in economic activity in South Africa. It considers several experimental settings,
such as alternative feature sets, data dimensions, and model parameters, to test whether
machine learning models outperform the benchmark model (the vector autoregressive model)
in forecasting stock market returns.


This research is valuable for investors who are developing investment strategies, as well
as for policymakers who want to identify appropriate analytical tools for answering critical
practice and policy questions about economic activities and the reaction of stock market
returns in South Africa.

1.2 Problem Statement

Policymakers typically use statistical multivariate forecasting models to forecast the reaction
of stock market returns to changing economic activities, then use insights from models to
structure economic policies that reduce systematic market errors and benefit the stock market
[Olayode et al. 2021; Saxena and Bhadauriya 2021; Shamsudin et al. 2021].

Previous research on the reaction of stock market returns to changing economic activities
focused on leading economies like the U.S. economy [Bordo and Jeanne 2002; Stadtmann
and Dunsch 2018]. However, research on emerging economies has stalled and still relies on
conventional forecasting models [Macfarlane 2011; Tsepang Patrick 2013; Chifurira and
Chinhamu 2019].

Prominent research used asset pricing models to understand the general equilibrium of ex-
change [Malamud 2015], some research used conventional statistical univariate forecasting
models [Pillay 2020; Chitenderu et al. 2014; Makatjane and Moroke 2021], and most re-
search repeatedly used a common statistical multivariate forecasting model, i.e., the vector
autoregressive model [Adejayan and Oke 2022], which is the benchmark model in this re-
search due to its widespread adoption.

Forecasting the reaction of stock market returns to changing economic activities using
statistical multivariate forecasting models remains challenging because these models are
inflexible and unable to model non-linear relationships [Pratiwi et al. 2021]. On the other
hand, Fu et al. [2018]; Wong et al. [2020]; Kamalov [2020], among others, suggested the
adoption of machine learning models, because these models can better handle data from
non-linear dynamic systems and yield stronger model performance.

There is currently limited use of machine learning to forecast the reaction of stock market
returns in emerging economies [Ramos-Pérez et al. 2021; Klibanov et al. 2021;
Makatjane and Moroke 2021]. Furthermore, most research used the full feature set and
compared only a few models, and it rarely examined how different feature subsets and
reduced dimensionality change model performance, a limitation this research addresses
through the number of experiments it considers.

This research established that machine learning models can be used to forecast the reaction
of stock market returns to changing economic activities in the emerging economic context.
It advances research by comparing various models trained on different feature subsets and
economic data dimensions.

1.3 Purpose Statement

This research investigated the dynamics of the economy and how stock market returns react
to changing economic activities in South Africa. It considered various experiments, i.e.,
different feature subsets and data dimensions, to determine whether machine learning models
perform better than the benchmark model (the vector autoregressive model) when forecasting
the reaction of stock market returns. It also produced an index that offers insight into the
structure of the economy in South Africa.

1.4 Research Questions

This research responded to four crucial research questions:

1. How do stock market returns react to changing economic activities in South Africa?

2. Do machine learning models perform better than the benchmark model (the vector au-
toregressive model) when forecasting the reaction of stock market returns to changing
economic activities in South Africa, as measured by the mean absolute percentage
error (MAPE) metric?

3. Stock market returns react differently to changing economic activities based on the
magnitude of change. Do models trained on a feature subset containing key economic
features, selected based on the Gini impurity value calculated by tree-based models,
perform better than models trained on the full feature set when forecasting the reaction
of stock market returns to changing economic activities in South Africa, as measured
by the MAPE metric?

4. Is the performance of models trained on low economic data dimensions (found by
the principal components analysis method) distinguishable from the performance of
models trained on high economic data dimensions when forecasting the reaction of
stock market returns to changing economic activities in South Africa, as measured by
the MAPE metric?
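
Each of these questions is settled by comparing MAPE values, so a minimal sketch of the metric may help. It assumes the standard definition, the mean of |actual - forecast| / |actual|, and returns the result as a fraction instead of multiplying by 100. The function name and the sample numbers are hypothetical and are not taken from the thesis data.

```python
import numpy as np


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (assumed standard definition)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if np.any(actual == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((actual - forecast) / actual)))


# Hypothetical index levels and two competing forecasts.
actual = [100.0, 102.0, 101.0, 105.0]
benchmark_forecast = [101.0, 101.0, 102.0, 104.0]
candidate_forecast = [100.5, 102.0, 101.5, 105.0]

benchmark_score = mape(actual, benchmark_forecast)
candidate_score = mape(actual, candidate_forecast)

# A lower MAPE means the forecast is closer to the actual values on average.
print("candidate beats benchmark:", candidate_score < benchmark_score)
```

Because the metric divides by the actual value, it is undefined when an actual value is zero, which is why the sketch raises an error in that case.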

1.5 Research Contributions

This research makes the following contributions:

• Determine whether machine learning models perform better than the benchmark model
when forecasting the reaction of stock market returns to changing economic activities
in South Africa across various experiments, as measured by the MAPE metric.

• Present key economic features in South Africa when forecasting the reaction of stock
market returns based on the Gini impurity value calculated by tree-based models.

• Provide insight into the structure of the economy in South Africa by producing an index
that ranks economic activities based on the explained variance ratio value calculated
by the principal components analysis method.

• Show how different feature subsets and data dimensions change the MAPE value of
models when forecasting the reaction of stock market returns to changing economic
activities in South Africa.
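
The explained variance ratio named in the third contribution can be sketched with plain NumPy. The example uses synthetic data and the standard definition (the variance of each principal component divided by the total variance). The array names and the data are assumptions, not the economic features used in the thesis, and the sketch ranks components only. How components are mapped back to individual economic activities is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for economic features: 200 observations, 4 features.
latent = rng.normal(size=(200, 1))
features = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),  # strongly driven by the latent factor
    latent + 0.1 * rng.normal(size=(200, 1)),  # strongly driven by the latent factor
    rng.normal(size=(200, 1)),                 # independent noise
    rng.normal(size=(200, 1)),                 # independent noise
])

# Standardize each column (zero mean, unit variance) before the decomposition.
standardized = (features - features.mean(axis=0)) / features.std(axis=0)

# Singular values of the standardized data give the variance of each component.
_, singular_values, _ = np.linalg.svd(standardized, full_matrices=False)
component_variance = singular_values**2 / (standardized.shape[0] - 1)
explained_variance_ratio = component_variance / component_variance.sum()

# Components come out ordered from most to least variance explained.
print(np.round(explained_variance_ratio, 3))
```

Because two of the four synthetic features share one latent driver, the first component captures the largest share of the variance, which is the kind of structure an explained variance ranking is meant to expose.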

1.6 Research Motivation

This study is significant because it analyzes how changes in economic activity affect stock
market returns in South Africa. It increases public awareness of the technical measurements
used in statistical and machine learning methodologies to forecast stock market index returns.
Furthermore, it offers innovative analytical solutions for improving the performance of
stock market index return prediction models.

This thesis contributes to the body of knowledge on stock market returns. Insights into the
economic drivers of stock market returns may be valuable to policymakers, because they may
better understand how their policies impact the market. Practitioners may also gain insight
into the stock market’s behavior and key tendencies when making investment decisions.

1.7 Research Structure

Table 1.1 shows the research structure.

TABLE 1.1: Research structure

| Number | Chapter Heading | Chapter Function |
|---|---|---|
| 1 | Introduction | This chapter states the research problem (section 1.2), the research purpose (section 1.3), research questions (section 1.4), and research contributions (section 1.5). |
| 2 | Literature Review | This chapter covers the stock market in South Africa (section 2.1), highlights primary asset pricing models for forecasting stock market returns (section 2.2) and statistical univariate and multivariate forecasting models commonly used to forecast stock market returns (subsection 2.3.1 and subsection 2.3.2), then presents the drawbacks of using asset pricing models and statistical forecasting models (subsection 2.2.5 and subsection 2.3.3). It also covers the state of literature and related research (section 2.4), then presents candidate machine learning models for addressing the drawbacks of using asset pricing models and statistical forecasting models (section 2.6 and section 2.7). |
| 3 | Research Methods | This chapter covers sources and properties of economic and stock market data of South Africa (subsection 3.1.1 and subsection 3.1.2), then specifies strategies for data imputation (section 3.2), outlier replacement (section 3.3), data dimension reduction (section 3.4), data partitioning (section 3.5), data scaling (section 3.6), and model regularization (section 3.8). The chapter concludes by presenting the metric used to evaluate the performance of candidate models (section 3.9). |
| 4 | Experiment Results & Discussions | This chapter shows exploratory descriptive statistical results of stock market returns in South Africa (section 4.1), produces an index that offers insight into the structure of the economy in South Africa (subsection 4.5.2), reports the performance of the benchmark model when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE value (section 4.2), then reports the performance of machine learning models when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE value (section 4.3 and subsection 4.3.3). The last segment of the chapter determines how different feature subsets (subsection 4.4.4) and data dimensions (subsection 4.5.3) change the MAPE value of candidate models. |
| 5 | Conclusions & Future Research | This chapter recaps the reviewed literature (section 5.2), covers the research method used (section 5.3), summarizes experiment results (section 5.4), provides practice and policy recommendations (section 5.5), details the use of the learned model (section 5.6), and highlights the road map for forthcoming research (section 5.7). |



2
Literature Review

This research adopts the conceptual framework for forecasting the reaction of stock market
returns to changing economic activities from previous research by Ahangar et al. [2010];
Jasra et al. [2012]; Ndikum [2020].

FIGURE 2.1: The reaction of stock market returns to changing economic activities, i.e., (A) economic activities, (B) international economic activities, (C) money and banking activities, (D) capital markets activities, and (E) national government finance activities

This research differs from previous research on the reaction of stock market returns to chang-
ing economic activities, in that it considers various economic features. The reaction of stock
market returns in South Africa is the target feature, and economic features are predictor fea-
tures (Table A.1 shows a thorough list of used economic features).


Table 2.1 shows a summary of used economic features and their categories.

TABLE 2.1: A summary of used economic features and their categories

Economic Feature | Category
Indices: consumer price, producer price, domestic mining and quarrying activities, domestic manufacturing, etc. Rates: interest, currency exchange, yields, bonds, securities, etc. Services: electricity, fuels, gas, supplies, water, etc. Goods: net trade and food, etc. | Economic category
Net average daily turnover, South African Reserve Bank (SARB) gross reserves in foreign currency, gold and other foreign reserves, etc. | International economic category
Money supply, credit, deposits and advances, liabilities, investment treasury bills, short-term credit, return on equity, etc. | Money and banking category
Traded shares value, fixed interest securities market, non-resident transactions, equity derivative markets, etc. | Capital markets category
National government revenue, expenditure, borrowing, financing of net borrowing requirements, etc. | National government finance category

2.1 The Stock Market in South Africa

The Johannesburg Stock Exchange (JSE) is the official stock market exchange of South
Africa. It was the 17th largest stock exchange in the world in 2017, as measured by the
market capitalization [SSEInitiative 2021; JSELtd 2021]. It is also the largest exchange in
Africa, with a total market value of US$1.36 trillion and 339 listed stocks across diverse
sectors [SSEInitiative 2021].

It is reasonable to consider South Africa the primary stock market in Africa. Forecasting returns of stocks listed on the JSE is valuable, but doing so for all stocks remains a challenging task, given the number of listed stocks.

This research focuses solely on the reaction of stock market index returns (Financial Times
Stock Exchange/Johannesburg Stock Exchange (FTSE/JSE) all-share index returns) to chang-
ing economic activities in South Africa.

2.1.1 The Financial Times Stock Exchange/Johannesburg Stock Exchange All-
Share Index

The FTSE/JSE all-share index represents the benchmark performance of stocks listed in the
JSE for 99.9% of the total market value. The benchmark performance is calculated using the
market capitalization-weighted index method. Equation 2.1 defines the market capitalization-
weighted index method:

$$\mathrm{MCPWI} = w_1 \times p_1 + w_2 \times p_2 + \dots + w_n \times p_n, \tag{2.1}$$

where $w_i$ denotes the weight of share $i$ and $p_i$ denotes the price of share $i$. Equation 2.2 defines the share weights:

$$w_i = \frac{MC_i}{\sum_{j=1}^{n} MC_j}, \tag{2.2}$$

where $MC_i$ denotes the total market capitalization of share $i$.
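The calculation in Equation 2.1 and Equation 2.2 can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the three-share market are hypothetical, and the actual FTSE/JSE methodology includes adjustments (for example free-float factors and index divisors) that are not modelled here.

```python
def cap_weights(market_caps):
    """Equation 2.2: each share's weight is its market cap over the total."""
    total = sum(market_caps)
    if total <= 0:
        raise ValueError("total market capitalization must be positive")
    return [mc / total for mc in market_caps]


def cap_weighted_index(market_caps, prices):
    """Equation 2.1: sum of weight times price over all shares."""
    weights = cap_weights(market_caps)
    return sum(w * p for w, p in zip(weights, prices))


# Hypothetical three-share market (illustrative values, not JSE data).
market_caps = [600.0, 300.0, 100.0]
prices = [50.0, 20.0, 10.0]

weights = cap_weights(market_caps)
index_level = cap_weighted_index(market_caps, prices)
```

The weights always sum to one, so the largest companies dominate the index level, which is the concentration effect discussed later for the top forty JSE stocks.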

2.2 The use of Asset Pricing Models in Forecasting the Reaction
of Stock Market Returns to Changing Economic Activities

The adoption of analytical tools for forecasting stock market returns and developing invest-
ment strategies is a long-standing phenomenon [Burdenko 2017; Ndikum 2020]. Research
on stock market returns in South Africa repeatedly used asset pricing models [Carter et al.
2017].

The capital asset pricing model values assets (subsection 2.2.1), then anticipates the future cash flow of assets [Munk 2013]. This model focuses on the general equilibrium of exchange in the market [Malamud 2015].

Alternatives to the capital asset pricing model, i.e., the arbitrage pricing model (subsection 2.2.2) and multi-factor model (subsection 2.2.3), consider systematic economic risk factors, while the modern portfolio model considers portfolio diversification (subsection 2.2.4).

2.2.1 The Capital Asset Pricing Model

The capital asset pricing model expresses the expected asset return as a function of the risk-free return and a risk-premium, and supplies the discount rate used for the net present value [Lintner 1965; Mossin 1966].
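For reference, the relationship described above is usually written in the following standard form. The notation is an assumption chosen to match the symbols used in Equation 2.3 and Equation 2.5; it is not taken from this subsection.

```latex
E(R_i) = R_f + \beta_i \left( E(R_M) - R_f \right)
```

Here $E(R_i)$ is the expected asset return, $R_f$ the risk-free return, $E(R_M)$ the expected market return, and $\beta_i$ the sensitivity of the asset return to the market return, so that $\beta_i (E(R_M) - R_f)$ is the risk-premium.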

Reddy and Thomson [2011] used the capital asset pricing model to explain expected excess
stock market returns and determine the linkage between expected stock market returns and
the beta in South Africa.

2.2.2 The Arbitrage Pricing Model

The arbitrage pricing model uses the linear-oriented framework to identify the extent to which
systematic economic risk factors influence the expected asset return [Huberman 2005]. Equa-
tion 2.3 defines the arbitrage pricing model:

$$ER(x) = R_f + \beta_1 R_{p_1} + \beta_2 R_{p_2} + \dots + \beta_n R_{p_n}, \tag{2.3}$$

where $ER(x)$ denotes the expected asset return, $R_f$ denotes the risk-free asset return, $\beta_n$ denotes the sensitivity of the asset price to fluctuations in systematic economic risk factors, and $R_{p_n}$ denotes a risk-premium emanating from systematic economic risk factors.

Muzindutsi and Niyimbanira [2012] used the arbitrage pricing model to determine the exposure of the returns of the top forty performing stocks listed on the JSE to a systematic economic risk factor, the exchange rate.

2.2.3 The Multi-Factor Model

The multi-factor model uses the arbitrage pricing model as a baseline asset pricing model to
identify the extent to which expected asset returns react to systematic economic risk factors,
i.e., the inflation rate, interest rate, and economic cycle, among other systematic economic
risk factors [Fama and French 1993].

The model equally considers market uncertainty, along with individual and joint variability
among assets (or portfolios). Equation 2.4 defines the multi-factor model:

$$R_i = E(R_i) + \beta_{i1} F_1 + \beta_{i2} F_2 + \dots + \beta_{ik} F_k + \epsilon_i, \tag{2.4}$$

where $R_i$ denotes the asset return, $E(R_i)$ denotes the expected asset return, $\beta_{ik}$ denotes the sensitivity of the asset or portfolio return to fluctuations in systematic economic risk factor $k$, $F_k$ denotes the systematic economic risk factor, and $\epsilon_i$ denotes market uncertainty.

Mukoyi and Ogujiuba [2022] compared the performance of various multi-factor models (i.e., the Fama and French three-factor model, Carhart four-factor model, and Fama and French five-factor model) when forecasting the reaction of stock market returns in the resource sector, industrial sector, and financial sector of South Africa to investment style risk.

The Estimation of Alpha and Systematic Risk Factors

Stock market research frequently considers the α (alpha) and β (beta) of the portfolio by
arranging a security characteristic line in the linear model (Equation 2.5).

$$R_i - R_f = \alpha_i + \beta_i (R_M - R_f) + \epsilon_i, \tag{2.5}$$

where $R_i$ denotes the realized portfolio return, $R_M$ denotes the market return, $R_f$ denotes the risk-free return, $\alpha_i$ denotes the alpha of the portfolio, and $\beta_i$ denotes the beta of the portfolio.
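A minimal sketch of how alpha and beta could be estimated from return data follows. It assumes the usual form of the security characteristic line, $R_i - R_f = \alpha_i + \beta_i (R_M - R_f) + \epsilon_i$, a constant risk-free return, and ordinary least squares via NumPy. The function name and the synthetic returns are hypothetical.

```python
import numpy as np


def estimate_alpha_beta(portfolio_returns, market_returns, risk_free):
    """Regress excess portfolio returns on excess market returns."""
    excess_portfolio = np.asarray(portfolio_returns, dtype=float) - risk_free
    excess_market = np.asarray(market_returns, dtype=float) - risk_free
    # Column of ones estimates alpha; second column estimates beta.
    design = np.column_stack([np.ones_like(excess_market), excess_market])
    coef, *_ = np.linalg.lstsq(design, excess_portfolio, rcond=None)
    alpha, beta = coef
    return float(alpha), float(beta)


# Synthetic, noise-free returns built with alpha = 0.001 and beta = 1.2.
risk_free = 0.005
market = np.array([0.02, -0.01, 0.03, 0.00, 0.015])
portfolio = risk_free + 0.001 + 1.2 * (market - risk_free)

alpha, beta = estimate_alpha_beta(portfolio, market, risk_free)
```

Because the synthetic data contain no noise, the regression recovers the alpha and beta used to build them; with real returns the estimates carry sampling error.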

The Estimation of Cumulative Systematic Risk Factors

The linear function considers systematic economic risk factors, idiosyncratic risk, and the
expected portfolio return, along with transaction cost. Equation 2.6 defines the linear func-
tion:

$$f(h) = \frac{1}{2} \kappa h_t^{T} Q^{T} Q h_t + \frac{1}{2} \kappa h_t^{T} S h_t - \alpha^{T} h_t + (h_{t-1})^{T} \Lambda (h_{t-1}), \tag{2.6}$$

where $\kappa$ denotes a risk aversion factor.

The multi-factor model equally considers market neutrality, along with the position size and
diversification of the portfolio. Equation 2.7 defines the gradient (slope coefficient) in the
linear model:

$$f'(h) = \frac{1}{2} \kappa (2 Q^{T} Q h) + \frac{1}{2} \kappa (2 S h) - \alpha + 2 (h_{t-1}) \Lambda. \tag{2.7}$$


Each common risk considers a systematic economic risk factor in the linear model. Equa-
tion 2.8 defines common risk:

$$C_{risk} = \frac{1}{2} h_t^{T} \beta F_{\beta} \beta^{T} h_t. \tag{2.8}$$

The multi-factor model considers the portfolio return variance, drives the variance toward zero, and uses vector estimates of α.

The model also considers the response of the asset price to the market impact for each currency unit exchanged. Equation 2.9 transforms β to Q to diminish matrix expansion:

$$C_{risk} = \frac{1}{2} h_t^{T} Q^{T} Q h_t. \tag{2.9}$$

Equation 2.10 defines the linear impact model:

$$\sum_{i=1}^{N} \lambda_{(i,t)} \left( h_{(i,t)} - h_{(i,t-1)} \right)^{2}, \tag{2.10}$$

where

$$\lambda_{(i,t)} = \frac{1}{10 \times ADV_{(i,t)}}, \tag{2.11}$$

where $ADV_{(i,t)}$ denotes the average daily volume traded for each 10 basis points.
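The impact cost in Equation 2.10 and Equation 2.11 can be illustrated with a short Python sketch. The function names and the two-asset holdings are hypothetical, and holdings and average daily volume are assumed to be expressed in the same units.

```python
def impact_coefficient(adv):
    """Equation 2.11: lambda = 1 / (10 * ADV)."""
    if adv <= 0:
        raise ValueError("average daily volume must be positive")
    return 1.0 / (10.0 * adv)


def linear_impact_cost(holdings_now, holdings_prev, adv):
    """Equation 2.10: sum of lambda times the squared change in holdings."""
    return sum(
        impact_coefficient(a) * (h_now - h_prev) ** 2
        for h_now, h_prev, a in zip(holdings_now, holdings_prev, adv)
    )


# Hypothetical two-asset rebalance (illustrative values only).
holdings_prev = [100.0, 200.0]
holdings_now = [150.0, 180.0]
adv = [1000.0, 500.0]

cost = linear_impact_cost(holdings_now, holdings_prev, adv)
```

The cost grows with the square of the trade size and shrinks with liquidity, so the same trade is penalized more heavily in a thinly traded asset.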

2.2.4 The Modern Portfolio Model

The modern portfolio model treats portfolio return volatility as a risk proxy and considers it along with the expected portfolio return and the return weights. The model also accounts for the statistical dependence among assets and identifies the efficient frontier of the portfolio [Elton and Gruber 2018]. Equation 2.12 estimates the expected portfolio return:

$$E(R_p) = \sum_{i=1}^{n} w_i E(R_i), \tag{2.12}$$

where $R_p$ denotes the portfolio return, $E(R_i)$ denotes the expected asset return, and $w_i$ denotes the weights of the asset return. Equation 2.13 defines the portfolio return variance:

$$\sigma_p^{2} = \sum_{i=1}^{n} w_i^{2} \sigma_i^{2} + \sum_{i=1}^{n} \sum_{j \neq i} w_i w_j \sigma_i \sigma_j p_{ij}, \tag{2.13}$$

where $\sigma_i$ denotes the extent to which the asset return deviates from the mean asset return, $w_i$ denotes the weights of the asset return, and $p_{ij}$ denotes the statistical dependence among assets. Equation 2.14 defines the portfolio return volatility:

$$\sigma_p = \sqrt{\sigma_p^{2}}. \tag{2.14}$$

Equation 2.15 defines the expected return of a portfolio containing two assets (i.e., asset A and B):

$$E(R_p) = w_A E(R_A) + w_B E(R_B) = w_A E(R_A) + (1 - w_A) E(R_B). \tag{2.15}$$

Equation 2.16 defines the portfolio return variance:

$$\sigma_p^{2} = w_A^{2} \sigma_A^{2} + w_B^{2} \sigma_B^{2} + 2 w_A w_B \sigma_A \sigma_B p_{AB}. \tag{2.16}$$
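The two-asset formulas can be checked numerically. The sketch below evaluates Equation 2.12 through Equation 2.14 in matrix form with NumPy, assuming that $p_{ij}$ is the correlation between asset returns. The function names and the input values are hypothetical.

```python
import numpy as np


def portfolio_return(weights, expected_returns):
    """Equation 2.12: weighted sum of expected asset returns."""
    return float(np.dot(weights, expected_returns))


def portfolio_variance(weights, volatilities, correlations):
    """Equation 2.13 in matrix form: w' * Sigma * w."""
    weights = np.asarray(weights, dtype=float)
    volatilities = np.asarray(volatilities, dtype=float)
    correlations = np.asarray(correlations, dtype=float)
    # Covariance matrix: sigma_i * sigma_j * p_ij.
    covariance = np.outer(volatilities, volatilities) * correlations
    return float(weights @ covariance @ weights)


# Hypothetical assets A and B (illustrative values only).
weights = [0.6, 0.4]
expected_returns = [0.10, 0.05]
volatilities = [0.20, 0.10]
correlations = [[1.0, 0.5], [0.5, 1.0]]

exp_return = portfolio_return(weights, expected_returns)
variance = portfolio_variance(weights, volatilities, correlations)
volatility = variance ** 0.5  # Equation 2.14
```

Expanding the matrix product by hand for these inputs gives 0.0144 + 0.0016 + 0.0048 = 0.0208, which matches the three terms of Equation 2.16.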

Taljaard and Maré [2021] used the modern portfolio model to identify changes in the concen-
tration of market capitalization-weights in the top forty performing stocks listed on the JSE.

2.2.5 The Drawbacks of Asset Pricing Models when Forecasting the Reaction
of Stock Market Returns to Changing Economic Activities

This research acknowledges the importance of asset pricing models and their contribution to
our understanding of forecasting the reaction of stock market returns to changing economic
activities. However, these models have certain drawbacks (Table 2.2).

TABLE 2.2: The drawbacks of asset pricing models when forecasting the
reaction of stock market returns to changing economic activities

Model | Drawback
The capital asset pricing model | The capital asset pricing model is criticized for its simplicity and impracticality, along with its failure to consider the statistical significance between systematic economic risk factors and the expected asset return [Muthama et al. 2014; Andrei et al. 2018].
The arbitrage pricing model | Prevailing research on the reaction of stock market returns to changing economic activities holds that exploiting the expected asset return should not remain the sole focus of investors. Instead, they should equally consider the asset return volatility or the portfolio return volatility [Sinha 2016; Khudoykulov 2017].
The modern portfolio model | The concern of investors about the downside risk, which represents the financial risk of losses from investing in assets or a portfolio, is not captured by the modern portfolio model [Otuteye and Siddiquee 2017; Crack and Grieves 2017]. The model equally does not consider a scenario whereby the expected portfolio return exceeds the actual portfolio return [Hou et al. 2017].


2.2.6 The Rationale for Comparing the Performance of Statistical and Machine Learning Models

The previous subsections discussed the most conventional models for forecasting stock market returns, namely the capital asset pricing model (subsection 2.2.1), the arbitrage pricing model (subsection 2.2.2), the multi-factor model (subsection 2.2.3), and the modern portfolio model (subsection 2.2.4), along with the inadequacies of these models in stock market forecasting (subsection 2.2.5). The modern portfolio model was not discussed in depth since this study does not focus on several South African stocks, but rather on the stock market index, which includes all South African stocks.

In conclusion, the typical asset pricing models outlined above are unable to manage the complexity of the data set employed in this study. This dissertation therefore examines the effectiveness of statistical models and machine learning models in forecasting the impact of economic activity on stock market returns.

2.3 The use of Conventional Statistical Models in Forecasting the
Reaction of Stock Market Returns to Changing Economic
Activities

Previous research on the reaction of stock markets in South Africa used conventional statistical forecasting models. For instance, Pillay [2020]; Chitenderu et al. [2014]; Makatjane and Moroke [2021] used statistical univariate forecasting models, and Aye et al. [2020]; Ilesanmi and Tewari [2020]; Adejayan and Oke [2022] used statistical multivariate forecasting models.

2.3.1 Statistical Univariate Forecasting Models

Statistical univariate forecasting models identify predictive patterns of a temporal feature and forecast successive patterns [Babu and Reddy 2014]. These forecasting models are prominent in research on stock market returns in South Africa [Pillay 2020; Chitenderu et al. 2014; Makatjane and Moroke 2021].

This research covers two conventional statistical forecasting models, i.e., the autoregressive integrated moving average model and the seasonal autoregressive integrated moving average model (both in subsection 2.3.1), to provide the background of statistical univariate forecasting.

The Autoregressive Integrated Moving Average Model

The autoregressive integrated moving average (ARIMA) (p, d, q) (or Box-Jenkins) model combines p (the number of lags k of the temporal feature), d (the number of times the temporal feature is differenced to change its properties), and q (the moving average order) to identify predictive patterns and forecast a temporal feature [Young and Shellswell 1972]. Equation 2.17 defines the ARIMA (p, d, q) model:

$$\hat{y}_t - y_{t-1} = \hat{\mu} + \hat{\phi} (y_{t-1} - y_{t-2}) + \dots + \hat{\epsilon}_t, \tag{2.17}$$

where $\hat{y}_t$ denotes estimates of the temporal feature at period t, $\hat{\mu}$ denotes unbiased estimates of the temporal feature, and $\hat{\epsilon}_t$ denotes residuals of the ARIMA (p, d, q) model.
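As an illustration of Equation 2.17, the sketch below fits the simplest case, an ARIMA (1, 1, 0) model, by ordinary least squares on the first differences. This is an assumption made for clarity: production libraries typically estimate ARIMA models by maximum likelihood, and the function names and synthetic series here are hypothetical.

```python
import numpy as np


def fit_arima_110(y):
    """Least-squares estimates of mu and phi for an ARIMA(1, 1, 0) model."""
    y = np.asarray(y, dtype=float)
    diff = np.diff(y)  # d = 1: work with first differences
    # Regress each difference on a constant and the previous difference.
    design = np.column_stack([np.ones(len(diff) - 1), diff[:-1]])
    target = diff[1:]
    (mu, phi), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(mu), float(phi)


def forecast_next(y, mu, phi):
    """One-step forecast: y_t = y_{t-1} + mu + phi * (y_{t-1} - y_{t-2})."""
    return float(y[-1] + mu + phi * (y[-1] - y[-2]))


# Synthetic series whose differences follow diff_t = 0.5 + 0.5 * diff_{t-1}.
diffs = [2.0]
for _ in range(9):
    diffs.append(0.5 + 0.5 * diffs[-1])
series = np.concatenate([[10.0], 10.0 + np.cumsum(diffs)])

mu, phi = fit_arima_110(series)
next_value = forecast_next(series, mu, phi)
```

Because the synthetic differences obey the recursion exactly, the fit recovers mu = 0.5 and phi = 0.5 up to floating-point error.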

Mallikarjuna and Rao [2019] used the ARIMA (p, d, q) model, along with other models like the self-exciting threshold autoregressive model, recurrent neural network, and a hybrid model of the ARIMA (p, d, q) model and recurrent neural network, to forecast stock market returns in South Africa.

The Seasonal Autoregressive Integrated Moving Average Model

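The seasonal autoregressive integrated moving average (SARIMA) (p, d, q) × (P, D, Q, s) model extends the ARIMA (p, d, q) model with additional seasonal parameters (i.e., p and seasonal P, d and seasonal D, q and seasonal Q), where s denotes the number of periods in a season [Hyndman and Athanasopoulos 2013]. For a season of s = 4 periods, Equation 2.18 defines the seasonal differencing in the SARIMA (p, d, q) × (P, D, Q, s) model:

$$\hat{y}_t^{s} = y_t - y_{t-4} + \hat{\epsilon}_t. \tag{2.18}$$

Equation 2.19 rewrites the differencing in Equation 2.18 using the backshift operator B, where $B^{4} y_t = y_{t-4}$:

$$y_t^{s} = (1 - B^{4}) \times y_t. \tag{2.19}$$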
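The seasonal differencing step can be illustrated directly. The sketch below applies $(1 - B^{s}) y_t = y_t - y_{t-s}$ with s = 4 to a hypothetical quarterly series; the function name and the data are illustrative only.

```python
def seasonal_difference(y, s=4):
    """Apply (1 - B^s): subtract the value observed s periods earlier."""
    if len(y) <= s:
        raise ValueError("series must be longer than the seasonal period")
    return [y[t] - y[t - s] for t in range(s, len(y))]


# Hypothetical quarterly series: a repeating seasonal pattern plus a
# rise of one unit per year.
quarterly = [10, 12, 14, 11, 11, 13, 15, 12, 12, 14]
differenced = seasonal_difference(quarterly, s=4)
```

The seasonal pattern cancels out and only the year-on-year change remains, which is why the differenced series is easier to model with the non-seasonal ARIMA terms.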

Makatjane and Moroke [2021] used the (SARIMA) (p, d, q)× (P, D, Q, s) model to fore-
cast stock market returns in South Africa.

2.3.2 The Statistical Multivariate Forecasting Model

Statistical multivariate forecasting models identify predictive patterns of temporal features
and forecast successive patterns while considering model residuals. Research on the reac-
tion of stock market returns to changing economic activities in South Africa repeatedly used
a popular conventional statistical multivariate forecasting model, the vector autoregressive
model [Aye et al. 2020; Ilesanmi and Tewari 2020; Adejayan and Oke 2022].

This research uses the vector autoregressive model as the benchmark model because of its widespread adoption in research.

The Vector Autoregressive Model

The vector autoregressive model maintains a stochastic system that models each feature as a linear combination of its own p lags and the p lags of the other features [Gouriéroux et al. 2017]. For clarification, assume a temporal problem with two features and p = 2. Equation 2.20 and Equation 2.21 define the vector autoregressive model:

$$\hat{y}_t = \hat{\alpha}_1 + \hat{\beta}_{11} y_{t-1} + \hat{\beta}_{12} y_{t-2} + \hat{\gamma}_{11} x_{t-1} + \hat{\gamma}_{12} x_{t-2} + \hat{\epsilon}_{1t}, \tag{2.20}$$

$$\hat{x}_t = \hat{\alpha}_2 + \hat{\beta}_{21} y_{t-1} + \hat{\beta}_{22} y_{t-2} + \hat{\gamma}_{21} x_{t-1} + \hat{\gamma}_{22} x_{t-2} + \hat{\epsilon}_{2t}, \tag{2.21}$$

where, in matrix form, $\hat{y}_t$ denotes k × 1 temporal feature estimates, $\hat{\alpha}$ denotes k × 1 intercept vector estimates, the $\hat{\beta}$ and $\hat{\gamma}$ coefficients form k × k coefficient matrix estimates, and $\hat{\epsilon}_t$ denotes a serially uncorrelated random vector with zero mean and joint variability among temporal features.
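A vector autoregressive model with p lags can be estimated equation by equation with ordinary least squares. The sketch below does this with NumPy for two simulated features. The function names, the simulated coefficients, and the coefficient layout (one row for the intercept, then one block of rows per lag) are assumptions made for this illustration, not the estimation code used in this research.

```python
import numpy as np


def fit_var(data, p=2):
    """Fit a VAR(p) by least squares; returns a (1 + k*p, k) coefficient array."""
    data = np.asarray(data, dtype=float)
    rows = []
    for t in range(p, len(data)):
        lags = [data[t - lag] for lag in range(1, p + 1)]
        rows.append(np.concatenate([[1.0], *lags]))  # intercept, lag 1, ..., lag p
    design = np.array(rows)
    coef, *_ = np.linalg.lstsq(design, data[p:], rcond=None)
    return coef


def forecast_one_step(data, coef, p=2):
    """Forecast all k features one period ahead."""
    data = np.asarray(data, dtype=float)
    lags = [data[-lag] for lag in range(1, p + 1)]
    return np.concatenate([[1.0], *lags]) @ coef


# Simulate a stable two-feature VAR(2) with known (hypothetical) coefficients.
rng = np.random.default_rng(0)
intercept = np.array([0.1, -0.2])
lag1 = np.array([[0.5, 0.1], [0.0, 0.3]])
lag2 = np.array([[0.2, 0.0], [0.1, 0.1]])
series = np.zeros((20000, 2))
for t in range(2, len(series)):
    noise = rng.normal(scale=0.1, size=2)
    series[t] = intercept + lag1 @ series[t - 1] + lag2 @ series[t - 2] + noise

coef = fit_var(series, p=2)
forecast = forecast_one_step(series, coef, p=2)
```

Row 0 of `coef` holds the intercepts, rows 1 to 2 hold the transposed lag-1 matrix, and rows 3 to 4 hold the transposed lag-2 matrix, which correspond to the α, β, and γ estimates in Equation 2.20 and Equation 2.21.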

Pillay [2020] and Macfarlane [2011] used the vector autoregressive model to forecast the
reaction of stock market returns to changing economic activities in South Africa.


The Structural Vector Autoregressive Model

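The structural vector autoregressive model bundles Equation 2.20 and Equation 2.21, adds the contemporaneous value of the other feature to each equation, then identifies causality among temporal features and concludes on the statistical significance among them [Gouriéroux et al. 2017; Tank et al. 2021]. Equation 2.22 and Equation 2.23 define the structural vector autoregressive model:

$$\hat{y}_t = \hat{\alpha}_1 + \hat{\beta}_1 x_t + \hat{\phi}_{11} y_{t-1} + \hat{\phi}_{12} y_{t-2} + \hat{\gamma}_{11} x_{t-1} + \hat{\gamma}_{12} x_{t-2} + \hat{v}_{1t}, \tag{2.22}$$

$$\hat{x}_t = \hat{\alpha}_2 + \hat{\beta}_2 y_t + \hat{\phi}_{21} y_{t-1} + \hat{\phi}_{22} y_{t-2} + \hat{\gamma}_{21} x_{t-1} + \hat{\gamma}_{22} x_{t-2} + \hat{v}_{2t}. \tag{2.23}$$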

2.3.3 The Drawbacks of Conventional Statistical Forecasting Models when Fore-
casting the Reaction of Stock Market Returns to Changing Economic
Activities

Conventional statistical forecasting models are widely used in previous research on the reaction of stock market returns to changing economic activities in South Africa, but they have drawbacks (Table 2.3).

TABLE 2.3: The drawbacks of conventional statistical forecasting models when forecasting the reaction of stock market returns to changing economic activities

Drawback | Description
Non-stationarity | Temporal features rarely follow a stationary process, and the model residual expansion problem is common in analysis, resulting in non-adherence to some requirements of conventional statistical forecasting models.
Non-linearity | Because temporal features are frequently non-linear, conventional statistical forecasting models are unsuitable for modeling complex data structures, i.e., when the data set contains multiple temporal features with high dimensions [Liu et al. 2021; Pahlawan et al. 2021a].
The curse of data dimensionality problem | As specified, conventional statistical forecasting models are incapable of handling a data set containing multiple temporal features with high dimensions [Ahangar et al. 2010].
Subpar model performance | Due to non-adherence of temporal features to conventional statistical forecasting model requirements, the model performance tends to be subpar [Zimmerman 1994; Anscombe and Guttman 1960; Pincus 1995].

The machine learning approach addresses the drawbacks of conventional statistical forecast-
ing models [BenSaïda and Litimi 2013; Liu et al. 2020; Kennedy et al. 2020].



2.4 State of Literature

Policymakers, particularly those in emerging economies, became concerned about the instability in the stock market during and after the global market crisis that started in 2008 [Hedging and Umoetok 2013]. Such occurrences amplified the need for sophisticated analytical tools like machine learning models (section 2.6), because these models can capture complex market data structures [Ramos-Pérez et al. 2021; Liu et al. 2021; Pahlawan et al. 2021a].

2.4.1 Related Research

This research reviews previous research, then identifies related research debates, along with research inconsistencies and gaps it intends to fill, before selecting
candidate machine learning models and the research method to use.

Table 2.4 shows the model criteria guiding this research, along with feature sets, model specifications, and the performance of models that forecast the reaction of
stock market returns to changing economic activities.

TABLE 2.4: Related research

Author | Features | Model Specifications | Model Performance
Ndikum [2020] | 200 economic features across multiple categories (i.e., the economic category, and money and banking category, among other economic categories). | Compared the performance of the restricted Boltzmann machine and the deep belief network when forecasting the reaction of S&P 500 returns to 200 economic features, as measured by the mean squared error (MSE). | The restricted Boltzmann machine achieved an MSE value of 0.36, whereas the deep belief network achieved an MSE value of 0.35.
Ahangar et al. [2010] | Data comprised 40 microeconomic features and economic features across multiple categories (i.e., the rates category, and the money and banking category, among other economic categories). | 7 economic features were selected based on the explained variance ratio value found by the principal component analysis method. The deep belief network (with three hidden layers and 14 neurons) was used to forecast the reaction of stock market returns to changing economic activities in Iran. | The MSE value for the deep belief network was 31.6.
Pahlawan et al. [2021a] | Data included 20 economic features spread over multiple economic categories (i.e., the rates category, indices category, and money and banking category, among other economic categories). | Compared the performance of the recurrent neural network, gated recurrent unit, and long-short term memory when forecasting the reaction of stock market returns (particularly the S&P 500 returns) to changing economic activities in the U.S., as measured by the MSE. | The MSE value for the long-short term memory was 1.35, the MSE value for the gated recurrent unit was 1.55, and the MSE value for the recurrent neural network was 1.55.
Wong et al. [2020] | Data comprised 74 industry-specific features and 102 economic features. | The deep belief network (with an online early stopping strategy) was used to forecast the reaction of stock market returns (particularly the S&P 500 returns) to changing industry-specific activities and economic activities in the U.S. | The MSE value for the deep belief network was 50.22.
Kamalov [2020] | Data comprised the S&P 500 price volatility simulation data. | Compared the performance of the long-short term memory, multi-layer perceptron, and convolutional neural network when classifying the reaction of stock market return volatility (particularly the S&P 500 return volatility) to changing economic activities in the U.S. | The long-short term memory outperformed the multi-layer perceptron and convolutional neural network, with a 0.85 area under the curve value.
Xiong et al. [2015] | Data contained multiple broad economic features. | The Shapley additive method was used to select features. The long-short term memory was used to forecast the reaction of stock market returns to a feature subset and whole economic features. | The first long-short term memory achieved an MSE value of 2 890, and the second long-short term memory achieved an MSE value of 2 880.
Klibanov et al. [2021] | Data comprised Russell index returns simulation data. | The deep belief network was used to classify Russell index returns. | The accuracy value for the deep belief network was 55.42.
Ramos-Pérez et al. [2021] | Data comprised a few economic features. | Compared the performance of the long-short term memory and multi-layer perceptron when forecasting the reaction of stock market return volatility (particularly S&P 500 return volatility) to changing economic activities in the U.S., as measured by the root mean squared error (RMSE). | The RMSE value for the multi-layer perceptron, which was astonishingly near zero, was the highest.
Fu et al. [2018] | Data comprised 244 technical and fundamental features. | Compared the performance of deep belief networks (with different structures) when forecasting the reaction of stock market returns (particularly S&P 500 returns) to changing technical features and economic activities in the U.S., as measured by the mean absolute error (MAE). | The MAE values for deep belief networks were 2.98 and 0.97, respectively.
Alhomadi [2021] | Data comprised 30 macroeconomic features and U.S. stock market indices features. | Compared the performance of the ordinary least-squares regression model, elastic net model, support vector regression model, random forest model, and extreme gradient boosting model, when forecasting the reaction of stock market returns (particularly the S&P 500 returns) to changing economic activities in the U.S., as measured by the R-Squared. | The R-Squared value of the ordinary least-squares regression model was 0.1967, the R-Squared value of the elastic net model was 0.4559, the R-Squared value of the support vector regression model was 0.4970, the R-Squared value of the random forest model was 0.3363, and the R-Squared value of the extreme gradient boosting model was 0.4215.
Nengovhela [2022] | Data comprised economic features and stock market indices features. | Compared the performance of the random forest model, k nearest neighbor model, support vector regression model, decision tree model, and neural network, when forecasting the reaction of stock market returns to changing economic activities in South Africa. | The MAE value of the random forest model was 0.9609, the MAE value of the k nearest neighbor model was 0.9817, the MAE value of the support vector regression model was 1.1231, the MAE value of the decision tree model was 1.3247, and the MAE value of the neural network was 0.9819.
Mallikarjuna and Rao [2019] | Data comprised stock market indices features of developed, emerging, and frontier economies. | Compared the performance of different models (i.e., the ARIMA (p, d, q) model, self-exciting threshold autoregressive model, recurrent neural network, singular spectrum analysis model, and a hybrid model of the ARIMA (p, d, q) model and recurrent neural network) when forecasting stock market returns in South Africa, as measured by the RMSE. | The RMSE of the ARIMA (3, 0, 1) model was 1.062336, the RMSE of the self-exciting threshold autoregressive model was 1.064449, the RMSE of the recurrent neural network was 1.063028, the RMSE of the singular spectrum analysis model was 1.066819, and the RMSE of the hybrid model was 1.061791.

Table 2.4 shows that most research used whole feature sets, compared only a few models, and barely examined how different feature subsets and reduced dimensionality change model performance, a limitation this research addresses through the number of experiments it runs.

This research established that machine learning models can be used to forecast the reaction of stock market returns to changing economic activities in the emerging
economic context. It advances research by comparing various models with different feature subsets and data dimensions.


2.5 Research Gaps

While earlier research has examined how economic variables impact stock market returns in South Africa, only a few studies have looked into the performance differences between statistical models and machine learning models. Those that did so concentrated on advanced economies, employed restricted experimental scenarios, and made out-of-sample forecasts over a single time horizon.

This study, on the other hand, focuses on comparing the effectiveness of statistical models and machine learning models in forecasting the reaction of stock market returns to changing economic activities in an emerging economy. This study is distinct in that it tests several models trained on distinct feature subsets and economic data components over varied time periods.

2.6 The use of Machine Learning Models in Forecasting the Re-
action of Stock Market Returns to Changing Economic Ac-
tivities

The machine learning approach denotes a practical systematic approach that maps out logical steps for sophisticated computer systems to learn tasks T based on experience [Liu 1996; Mitchell 1997; Kennedy et al. 2020]. This approach improves on task T using knowledge of prior performance P estimates, identifies complex data structures, and enhances the generalization capacity [Dietterich 1996], which is useful for forecasting the reaction of stock market returns to changing economic activities.

This research used the ordinary least-squares regression model, decision tree model, random
forest tree model, extreme gradient boosting tree model, recurrent neural network, gated
recurrent unit network, long-short term memory, restricted Boltzmann machine, and multi-
layer perceptron.

2.6.1 The Ordinary Least-Squares Regression Model

The ordinary least-squares regression model learns $x_i$ (predictor features) and predicts $y_i$ (a target feature) while minimizing the sum of squared $\hat{\epsilon}_i$ (ordinary least-squares regression model residuals). Equation 2.24 defines the ordinary least-squares regression model:

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\epsilon}_i, \tag{2.24}$$

where $\hat{y}_i$ denotes predicted $y_i$ estimates, $\hat{\beta}_0$ (the y-intercept) denotes the value of $\hat{y}_i$ where $x_i = 0$, and $\hat{\beta}_1$ (a slope coefficient) denotes the path of corresponding changes between $x_i$ and $\hat{y}_i$. Equation 2.25 defines $\hat{\beta}_1$:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}, \tag{2.25}$$

where $\bar{x}$ denotes the mean value of $x_i$ and $\bar{y}$ denotes the mean value of $y_i$.

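Equation 2.26 defines $\hat{\epsilon}_i$:

$$\hat{\epsilon}_i = y_i - \hat{y}_i. \tag{2.26}$$

Equation 2.24 with more $x_i$ results in Equation 2.27:

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3 + \dots + \hat{\epsilon}_i. \tag{2.27}$$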
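The single-predictor case can be computed directly from the closed-form estimators. The sketch below implements the slope estimator (covariance of x and y over the variance of x) and the residuals from Equation 2.26; the intercept uses the standard identity that the fitted line passes through the point of means, which is an addition not stated in this subsection. The function name and the data are hypothetical.

```python
def fit_simple_ols(x, y):
    """Closed-form ordinary least-squares fit for one predictor feature."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    beta1 = sxy / sxx              # slope estimator
    beta0 = y_bar - beta1 * x_bar  # line passes through (x_bar, y_bar)
    return beta0, beta1


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

beta0, beta1 = fit_simple_ols(x, y)
residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]  # Equation 2.26
```

With several predictor features, as in Equation 2.27, the same idea is normally expressed in matrix form and solved with a linear algebra routine rather than by hand.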

Marozva [2020] used the ordinary least-squares regression model to forecast the reaction of stock market return volatility to changing economic activities and political activities in South Africa. Equally, Mpofu [2011] used the model to forecast the reaction of stock market returns to the manufacturing index and prime overdraft rate, among other features.

2.6.2 The Decision Tree Model

The decision tree model adopts a recursive partitioning strategy to isolate feature values.
Successively, the model splits nodes so that each split reduces the impurity, as measured by
the gini impurity estimator (Equation 2.28) or the entropy estimator (Equation 2.29)
[Moore II 1987]. Equation 2.28 defines the gini impurity estimator:

\hat{f}(x_i) = 1 - \sum_{j=1}^{c} \hat{p}_j^2, (2.28)

where p̂j denotes the sample proportion of class j in the node and c denotes the number of
classes. By dividing feature values into manageable chunks, the entropy estimator isolates
homogeneous feature values of nodes and then identifies irregularities. Equation 2.29 defines
the entropy estimator:

\hat{f}(x_i) = -\sum_{j=1}^{c} \hat{p}_j \log_2 \hat{p}_j. (2.29)

For Equation 2.29, the sum covers only classes with p̂j ≠ 0 (an empty class contributes
nothing). Entropy = 0 where all feature values in the node belong to the same class.
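The two impurity estimators can be sketched in a few lines. This is an illustrative example on hypothetical class labels, with entropy written in the usual Shannon form −Σ p̂j log2 p̂j; the function names are assumptions.

```python
from collections import Counter
from math import log2


def class_proportions(labels):
    """Sample proportion of each class that is present in a node."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_impurity(labels):
    """One minus the sum of squared class proportions."""
    return 1.0 - sum(p ** 2 for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits; only observed classes appear, so every p > 0."""
    return -sum(p * log2(p) for p in class_proportions(labels))


# Hypothetical nodes: one holds a single class, the other is evenly mixed.
pure_node = ["up", "up", "up", "up"]
mixed_node = ["up", "up", "down", "down"]
```

Both estimators return 0 for the pure node and reach their two-class maximum (0.5 for gini impurity, 1 bit for entropy) for the evenly mixed node.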

Nengovhela [2022] used decision tree models to forecast the reaction of stock market returns
to changing economic activities in South Africa.

2.6.3 The Random Forest Tree Model

The random forest tree model unifies decision tree models produced through a random process
and averages their predictions to enhance model performance [Vijayakumar and Cheung 2018].
Equation 2.30 defines the random forest tree model:

\hat{y}_i = \frac{1}{N} \sum_{n=1}^{N} \hat{f}_n(x'_i), (2.30)

where N denotes the number of decision tree models and f̂n(x′i) denotes the function of the
nth decision tree model (a linear function for this research).


Nengovhela [2022] used random forest tree models to forecast the reaction of stock market
returns to changing economic activities in South Africa.

2.6.4 The Extreme Gradient Boosting Tree Model

The extreme gradient boosting tree model bundles shallow decision tree models to enhance
model performance by using the loss minimization approach at multiple iterations [Hastie et
al. 2009; Nokeri 2021]. At each iteration, this involves evaluating the performance of the
decision tree models fitted so far and fitting the next decision tree model to the residuals
from previous iterations [Mason et al. 1999]. Equation 2.31 learns xi:

yi = m̂i (xi) + ϵ̂1. (2.31)

Equation 2.32 investigates the ϵ̂i dependency:

ϵ̂1 = ĝi (xi) + ϵ̂2, ϵ̂2 = ĥi (xi) + ϵ̂3. (2.32)

Equation 2.33 bundles the regressed ϵ̂i in Equation 2.32:

yi = m̂i (xi) + ĝi (xi) + ĥi (xi) + ϵ̂3. (2.33)

Equation 2.34 completes the model by weighting each decision tree model:

yi = α × m̂i (xi) + β̂ × ĝi (xi) + γ̂ × ĥi (xi) + ϵ̂4. (2.34)
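The residual-fitting idea behind these equations can be sketched with one-split trees (stumps) on a single feature. This is a simplified illustration of generic gradient boosting for squared error, not the extreme gradient boosting library itself (it omits regularization and second-order terms); the data, function names, and the fixed `learning_rate` standing in for the model weights are assumptions.

```python
from statistics import mean


def fit_stump(x, targets):
    """Fit a one-split regression tree that minimizes squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [t for xi, t in zip(x, targets) if xi <= threshold]
        right = [t for xi, t in zip(x, targets) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def boost(x, y, n_rounds=3, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    predictions = [mean(y)] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        predictions = [pi + learning_rate * stump(xi) for xi, pi in zip(x, predictions)]
    return predictions


def sum_squared_error(y, predictions):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))


# Hypothetical toy data with a jump between x = 3 and x = 4.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
baseline = [mean(y)] * len(y)
one_round = boost(x, y, n_rounds=1)
three_rounds = boost(x, y, n_rounds=3)
```

Each additional round fits what the previous rounds left unexplained, so the training error shrinks from the baseline to one round to three rounds.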

Alhomadi [2021] used the extreme gradient boosting tree model to forecast the reaction of
stock market returns to changing economic activities in the U.S. No research used the model
to forecast the reaction of stock market returns in South Africa.

2.7 The use of Neural Networks in Forecasting the Reaction of
Stock Market Returns to Changing Economic Activities

Neural networks are a distinct class of machine learning models loosely modeled on animals’
biological neural networks. A neural network accumulates xi (predictor features) in the input
layer (the first layer in a neural network), then uses f̂ (xi) (an activation function) to identify
complex data structures and route feature values to successive hidden layers (layers between
the input layer and the output layer in a neural network), which designate dissimilar wi (weights)
and βi (biases), then uses f̂ (xi) in the output layer (the last layer in a neural network) to
forecast ŷi (predicted target feature estimates).

This research considers a subset of machine learning models known as neural networks, i.e.,
the recurrent neural network (subsection 2.7.1), gated recurrent unit (subsection 2.7.2),
long-short term memory (subsection 2.7.3), restricted Boltzmann machine (subsection 2.7.4),
and multi-layer perceptron (subsection 2.7.5).


2.7.1 The Recurrent Neural Network

The recurrent neural network accumulates xt, then uses f̂ (xt) to identify xt and ĥt−1 (previous
hidden states), designate distinctive wt, and forecast yt (target feature values) and
successive ĥt using the tangent hyperbolic activation function. Equation 2.35 defines the
recurrent neural network:

ĥt = tanh(wh × ĥt−1 + wx × xt), (2.35)

where ĥt denotes hidden states, wh denotes weights of ĥt−1, wx denotes weights of xt, and
tanh denotes a tangent hyperbolic activation function that restricts ĥt to [-1, 1].
Equation 2.36 predicts yt:

ŷt = why × ĥt, (2.36)

where why denotes weights of ĥt in the output layer.
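A scalar version of the recurrence makes the flow of the hidden state concrete. The sketch below assumes the standard additive form, tanh(wh × ht−1 + wx × xt), followed by a linear output; the weight values and function names are hypothetical and chosen only for illustration.

```python
from math import tanh


def rnn_step(x_t, h_prev, w_x, w_h):
    """One recurrent update for a scalar input and a scalar hidden state."""
    return tanh(w_h * h_prev + w_x * x_t)


def rnn_forecast(sequence, w_x=0.5, w_h=0.8, w_hy=1.5):
    """Run the recurrence over a sequence and emit one output per time step."""
    h_t = 0.0  # initial hidden state
    outputs = []
    for x_t in sequence:
        h_t = rnn_step(x_t, h_t, w_x, w_h)  # hidden state carries past information
        outputs.append(w_hy * h_t)          # linear read-out of the hidden state
    return outputs


# Hypothetical sequence of three monthly feature values.
outputs = rnn_forecast([0.1, -0.2, 0.4])
```

Because tanh keeps the hidden state within [-1, 1], each output here is bounded in magnitude by the output weight (1.5).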

Sako et al. [2022] used the recurrent neural network to forecast the reaction of stock market
returns to the exchange rate (ZAR/USD) in South Africa.

Mallikarjuna and Rao [2019] compared the performance of different models (i.e., the ARIMA
(p, d, q) model, self-exciting threshold autoregressive model, recurrent neural network, sin-
gular spectrum analysis model, and a hybrid model of the ARIMA (p, d, q) model and re-
current neural network) when forecasting stock market returns in South Africa, among other
countries, as measured by the root mean squared error. The research found that all candidate
models outperform the recurrent neural network.

2.7.2 The Gated Recurrent Unit Network

The gated recurrent unit uses f̂ (xt) to learn xt and ht−1, and contains a reset gate that
forgets uninformative xt and ht−1. The neural network also contains an update gate
[Chung et al. 2014], whereby ht = 0 at t = 0:

zt = σ (wz[ht−1, xt]) , (2.37)

rt = σ (wr[ht−1, xt]) , (2.38)

h̃t = tanh (wh[rt × ht−1, xt]) , (2.39)

ĥt = (1 − zt) × ht−1 + zt × h̃t, (2.40)

where zt denotes an update gate, rt denotes a reset gate, h̃t denotes candidate hidden states,
ĥt denotes hidden states, xt denotes predictor features, σ denotes a sigmoid activation
function, tanh denotes a tangent hyperbolic activation function, and wz, wr, and wh denote
weight matrices.

Sako et al. [2022] compared the performance of various neural networks (i.e., the gated
recurrent unit, among other neural networks like the recurrent neural network and long-short
term memory) when forecasting the reaction of stock market returns to the exchange rate, the
South African rand to the U.S. dollar.

2.7.3 The Long-Short Term Memory

The long-short term memory accumulates xt and uses f̂ (xt) in a forget gate to decide which
parts of xt and ht−1 to discard, before routing the rest through an input gate (or update gate)
to cell state vectors, which an output gate uses to forecast yt [Hochreiter and Schmidhuber 1997].

ft = σ(wf [ht−1, xt] + β̂f ), (2.41)

it = σ(wi[ht−1, xt] + β̂i), (2.42)

c̃t = tanh(wC[ht−1, xt] + β̂C), (2.43)

ct = ft × ct−1 + it × c̃t, (2.44)

ot = σ(wo[ht−1, xt] + β̂o), (2.45)

ĥt = ot × tanh(ct), (2.46)

where xt denotes predictor features, ft denotes the activation vector of the forget gate, it
denotes the activation vector of the input gate (or update gate), ot denotes the activation
vector of the output gate, c̃t denotes candidate cell state vectors, ct denotes cell state
vectors, ĥt denotes predicted hidden states, wf, wi, wC, and wo denote weight matrices, and
β̂f, β̂i, β̂C, and β̂o denote the biases of the corresponding gates and cell state vectors.
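The gate interactions are easier to follow in a scalar sketch. The example below assumes the standard long-short term memory cell update (forget gate, input gate, candidate cell state, output gate) with a scalar input, hidden state, and cell state; the parameter dictionary, its values, and the function names are hypothetical.

```python
from math import exp, tanh


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One cell update; params maps a gate name to (w_h, w_x, bias)."""
    def pre_activation(name):
        w_h, w_x, bias = params[name]
        return w_h * h_prev + w_x * x_t + bias

    f_t = sigmoid(pre_activation("forget"))     # share of the old cell state to keep
    i_t = sigmoid(pre_activation("input"))      # share of the candidate to admit
    c_candidate = tanh(pre_activation("cell"))  # candidate cell state
    c_t = f_t * c_prev + i_t * c_candidate      # updated cell state
    o_t = sigmoid(pre_activation("output"))     # share of the cell state to expose
    h_t = o_t * tanh(c_t)                       # updated hidden state
    return h_t, c_t


# Hypothetical parameters shared by all gates, run over a short sequence.
params = {name: (0.5, 0.5, 0.0) for name in ("forget", "input", "cell", "output")}
h_t, c_t = 0.0, 0.0
for x_t in [0.1, -0.2, 0.4]:
    h_t, c_t = lstm_step(x_t, h_t, c_t, params)
```

The cell state is updated additively (old state scaled by the forget gate plus gated candidate), which is the design choice that lets information persist across many time steps.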

Balusik et al. [2021] compared the performance of the long-short term memory to the
performance of the SARIMA (p, d, q) × (P, D, Q, s) model when forecasting stock market
returns in South Africa, as measured by the RMSE metric. The research found the long-short
term memory outperforms the SARIMA (p, d, q) × (P, D, Q, s) model.

2.7.4 The Restricted Boltzmann Machine

The restricted Boltzmann machine denotes an abstract neural network with v̂i (a visible
layer) and ĥj (a hidden layer) attached through wi,j (weights). Equation 2.47 defines the
energy function of v̂i and ĥj:

E(\hat{v}, \hat{h}) = -\sum_{i=1}^{n} \alpha_i \hat{v}_i - \sum_{j=1}^{m} \hat{\beta}_j \hat{h}_j - \sum_{i=1}^{n} \sum_{j=1}^{m} \hat{v}_i w_{i,j} \hat{h}_j, (2.47)

where αi denotes biases of v̂i, β̂j denotes biases of ĥj, n denotes the number of visible units,
and m denotes the number of hidden units. Equation 2.48 restates Equation 2.47 in matrix form:

E(\hat{v}, \hat{h}) = -\alpha^{T} \hat{v} - \hat{\beta}^{T} \hat{h} - \hat{v}^{T} W \hat{h}. (2.48)


da Costa and Gebbie [2020] used the restricted Boltzmann machine stacked with auto-encoders
to forecast the stock market price in South Africa.

2.7.5 The Multi-Layer Perceptron

The multi-layer perceptron maintains an input layer and output layer, along with two layers
between them (hidden layers) at most. This feed-forward network uses the back-propagation
learning approach to estimate the gradient of the loss function with respect to its weights.
Equation 2.49 accumulates and converts xi and designates dissimilar wij and β̂j, then routes
them to an initial hidden layer:

\hat{f}_j(x) = \hat{\beta}_j + \sum_{i=1}^{n} w_{ij} x_i. (2.49)

Equation 2.50 uses φj (a sigmoid activation function) in an initial hidden layer, which
incrementally routes feature values to successive hidden layers:

\varphi_j = [1 + \exp(-\hat{f}_j(x))]^{-1}. (2.50)

Equation 2.51 combines φj to forecast yi, which are treated as xi by successive hidden layers:

\hat{y}_i = \hat{\beta} + \sum_{j=1}^{m} w_j \varphi_j, (2.51)

where m denotes the number of units in the hidden layer.
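The three steps (weighted sum, sigmoid activation, output combination) can be sketched as a single forward pass. This is a minimal example with one hidden layer and hypothetical weights, assuming the standard sigmoid 1 / (1 + exp(−z)); it omits back-propagation, and the function names are assumptions.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass through one sigmoid hidden layer and a linear output layer."""
    hidden = []
    for weights, bias in zip(hidden_weights, hidden_biases):
        z = bias + sum(w * xi for w, xi in zip(weights, x))  # weighted sum per hidden unit
        hidden.append(sigmoid(z))                            # activation per hidden unit
    # Output: bias plus the weighted sum of hidden activations.
    return output_bias + sum(w * phi for w, phi in zip(output_weights, hidden))


# Hypothetical network with two inputs and two hidden units.
y_hat = mlp_forward(
    x=[1.0, 2.0],
    hidden_weights=[[0.4, -0.1], [0.3, 0.2]],
    hidden_biases=[0.0, -0.5],
    output_weights=[1.2, -0.7],
    output_bias=0.1,
)
```

With all hidden weights and biases set to zero, each hidden unit outputs sigmoid(0) = 0.5, which gives a quick sanity check on the wiring.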

Ataman and Kahraman [2021] compared the performance of the ordinary least-squares model,
multi-layer perceptron, and a hybrid model (integrating the ordinary least-squares model and
multi-layer perceptron) when forecasting the reaction of stock market returns to changing
economic activities in Brazil, Russia, India, China, and South Africa, as measured by the R-
Squared metric. The research found the hybrid model outperforms the ordinary least-squares
model and multi-layer perceptron.

2.8 The Activation Function

To forecast yi (target feature values), f̂ (xi) (an activation function) first accumulates xi and
identifies complex data structures, then predicts subsequent yi. There are multiple activation
functions (i.e., the sigmoid activation function, tangent hyperbolic activation function, and
rectified linear unit activation function, among other activation functions).

2.8.1 The Rectified Linear Unit Activation Function

This research uses a rectified linear unit (relu) activation function at layers of neural networks,
because it does not bound ŷi from above: it produces ŷi that range from 0 to ∞ (infinity) after
accumulating xi and identifying their predictive patterns. Equation 2.52 defines the relu
activation function:

Relu = max(0, xi). (2.52)


The relu activation function acquires feature values (and/or hidden states) from the input
layer and absorbs them, then routes them to successive hidden layers, which learn predictive
patterns and attach varying biases and weights.

In the last hidden layer, the relu activation function directs predictor feature values (and/or
hidden states) to the output layer, which predicts concealed target feature values (and/or
subsequent hidden states).


3 Research Methods

This research adopts a research method from previous research. Figure 3.1 shows the model-
ing pipeline exhibiting the workflow that produces models that forecast the reaction of stock
market returns to changing economic activities.

FIGURE 3.1: The modeling pipeline exhibiting the workflow that produces models that forecast
the reaction of stock market returns to changing economic activities

3.1 The Acquisition of Economic and Stock Market Data of South
Africa

This research extracts 54 monthly economic features from the South African Reserve Bank
(SARB) database, and 99 monthly economic features from the Federal Reserve Economic
Data (FRED) database, because those were the monthly features available in the specified
databases. It extracts the monthly FTSE/JSE all-share index price feature from the JSE
database. The sample period begins in 2002, because the Financial Times Stock
Exchange/Johannesburg Stock Exchange (FTSE/JSE) all-share index was established in 2002 by
the FTSE and JSE, and the majority of the features considered in this research began to be
covered around 2002.

3.1.1 Sources of Economic and Stock Market Data of South Africa

Table 3.1 shows sources of economic and stock market data of South Africa, along with the
description of data and the sample period.

TABLE 3.1: Sources of economic and stock market data of South Africa,
along with the description of data and the sample period

| Source | Description | Features | Sample Period |
| --- | --- | --- | --- |
| FRED | The Federal Reserve Bank of St. Louis is a regional reserve bank that is part of the United States Central Bank, which has headquarters in Washington, D.C. ¹ | 99 monthly changing economic features (predictor features) sourced from the FRED database. | 2002 - 2022 |
| SARB | SARB is the Central Bank of South Africa. ² | 54 monthly changing economic features (predictor features) sourced from the SARB database. | 2002 - 2022 |
| JSE | JSE is the official stock exchange in South Africa. | A single monthly temporal feature, the FTSE/JSE all-share index price (the target feature) sourced from the SARB. ³ | 2002 - 2022 |

The FRED database combines data from many sources, including Statistics South Africa (Stats
SA), the Organization for Economic Co-operation and Development (OECD), and the South
African Reserve Bank (SARB). Some variables are unavailable in the SARB database. As
a result, the FRED database was utilized as a supplementary data source to collect South
African economic information.

¹ The Federal Reserve Bank of St. Louis periodically maintains economic data of many nations,
along with changing economic activities, in a database known as FRED.

² The principal SARB mandate is to structure and influence South African economic expansionary
policies, and produce banknotes and coins, among other mandates.

³ The FTSE/JSE all-share index represents the benchmark performance of equities in South Africa
for 99.9% of the total market capitalization.

3.1.2 Properties of Economic and Stock Market Data of South Africa

Table 3.2 shows properties of economic and stock market data of South Africa, and commentary
on how those properties affect models that forecast the reaction of stock market returns
to changing economic activities, along with strategies for addressing non-adherence to model
requirements. The purpose of identifying the properties was to determine which research
techniques to use.

TABLE 3.2: The properties of economic and stock market data of South Africa, and commentary
on how those properties affect models that forecast the reaction of stock market returns to
changing economic activities, along with strategies for addressing non-adherence to model
requirements

| Data Properties | Vector Autoregressive, Linear Regression, Tree-based, Neural Network | Comments |
| --- | --- | --- |
| Non-stationarity | *** *** | Certain economic features are not stationary. To adhere to the vector autoregressive model stationarity requirement, this research used an order differencing strategy to produce economic features that proceed linearly and prevent the model residual expansion problem. |
| Few economic feature values | *** *** * *** | Economic features contain few feature values, which makes it challenging to identify complex predictive patterns using neural networks, as neural networks are intended for large data sets. |
| Missing economic feature values | * | While neural networks and the decision tree model do not hold rigid model requirements regarding missing values, the vector autoregressive model and ordinary least-squares regression model hold rigid requirements. The k nearest neighbor imputation strategy imputes missing economic feature values. |
| High dimensions of economic activities | * *** ** *** | The vector autoregressive model and ordinary least-squares regression model cannot identify complex structures in economic and stock market data and generalize unknown observations when there are high dimensions of economic activities. The principal components analysis method reduces the dimensions of economic activities of South Africa into meaningful eigenvectors. |
| Different scales and quantities of economic features | *** *** *** *** | Economic features come in different scales and quantities. The data standardization scaling strategy scales changing economic activities before modeling. |

Legend:
* Poor data modeling capacity.
** Regular data modeling capacity.
*** Optimal data modeling capacity.

3.1.3 Strategies used to Cleanse and Preprocess Data Before Forecasting

Table 3.3 shows strategies used to cleanse and preprocess data before forecasting
the reaction of stock market returns to changing economic activities in South Africa.

TABLE 3.3: Strategies used to cleanse and preprocess data before forecasting the reaction of
stock market returns to changing economic activities in South Africa

| Strategy | Description |
| --- | --- |
| Redundant economic features removal | There were duplicate economic features present in original data sets extracted from the Federal Reserve Economic Data (FRED) database and South African Reserve Bank (SARB) database. Duplicate features were removed before combining data sets to avoid redundancy in analysis. |
| Data labeling | Economic features extracted from the FRED database had economic feature codes as column names. Columns were renamed to their official economic feature names. |
| Data concatenation | Economic features came from different data sources (i.e., the FRED database, SARB database, and JSE database). Economic features were concatenated to complete the data set. |
| Indexing | The index (date) of economic features extracted from the FRED database and of stock market returns had a format that was incompatible with platform and model requirements. The format was set to a Year-Month-Day format. |
| Sample period specification | Economic features had varying sample periods. Sample periods were set over 22 years (from 2002 to 2022), because the FTSE/JSE all-share index was established in 2002 by the FTSE and JSE. |
| Separator conversion | Economic features extracted from the SARB database contained comma separators. Comma separators were converted to decimal point separators, so features adhere to platform and model requirements. |
| Infinity conversion | Certain economic features contained negative and positive infinity values of a float type. Infinity values of a float type were converted to NaN (missing feature values), then replaced using the k nearest neighbor data imputation strategy. |
| Feature stationarity by order differencing | Features were made stationary using the order differencing strategy. |
| Data imputation | Economic features contained missing feature values. The k nearest neighbor data imputation strategy replaced missing feature values with values in proximity to them. |
| Outlier replacement | Economic features had varying sample periods. In this manner, features contained outliers. The median outlier replacement strategy replaced outliers with the median value. |
| Data partitioning | The random data partitioning strategy ensured models learn economic activities and forecast the reaction of stock market returns in South Africa. This strategy was adopted with varying split ratios to compare the performance of models over four periods. |
| Data dimension reduction | The principal components analysis method reduced dimensions of economic activities in South Africa into meaningful eigenvectors. The principal component analysis method produced an index that offers insight into the structure of the economy in South Africa based on the explained variance ratio. |
| Data standardization | Economic features came in different scales and quantities. The data standardization scaling strategy places economic activities on a standard scale. |
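Three of the cleansing steps in Table 3.3 (separator conversion, infinity conversion, and first-order differencing) can be sketched on a short list of raw values. The example is illustrative only: it assumes the comma in the raw strings is a decimal separator, the raw values are hypothetical, and the function names are not taken from the research code.

```python
from math import isinf, isnan, nan


def parse_decimal_comma(text):
    """Convert a string such as '100,5' (comma as decimal separator) to a float."""
    return float(text.replace(",", "."))


def infinities_to_nan(values):
    """Mark positive and negative infinity as missing (NaN)."""
    return [nan if isinf(v) else v for v in values]


def first_difference(values):
    """First-order differencing: each value minus the previous value."""
    return [current - previous for previous, current in zip(values, values[1:])]


# Hypothetical raw column with decimal commas and one infinite entry.
raw = ["100,5", "101,0", "inf", "103,5"]
parsed = [parse_decimal_comma(text) for text in raw]
cleaned = infinities_to_nan(parsed)

# Differencing a complete (already imputed) series.
differenced = first_difference([100.5, 101.0, 102.0, 103.5])
```

The NaN produced by the infinity conversion would then be handed to the imputation step before differencing, since a difference involving NaN is itself NaN.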

3.2 The Strategy used to Impute Missing Economic Data of South
Africa

Most machine learning models cannot handle missing values [Rubin 1975], hence Leke and
Marwala [2019] and Thulare et al. [2021] suggested the use of data imputation strategies. This
research used the k nearest neighbor imputation strategy to replace missing values
(subsection 3.2.1).

3.2.1 The k Nearest Neighbor Imputation Strategy

Previous research suggested the use of the k nearest neighbor imputation strategy to impute
missing values of economic features with values in proximity using the Euclidean distance
(d) method [Zhang 2012; Mulaudzi and Ajoodha 2020]. Equation 3.1 defines the Euclidean
d method:

d(x_i, y_i) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}, (3.1)

where d denotes the distance between feature values, n denotes the number of xi, whereas xi
and yi denote feature values in a Euclidean space.
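A simplified version of the imputation can be sketched as follows. This is an assumption-laden illustration rather than the procedure used in this research: distances are computed only over positions observed in both rows, neighbors are drawn from complete rows only, and the imputed value is the unweighted neighbor mean. Library implementations differ in these details.

```python
from math import isnan, nan, sqrt


def euclidean_distance(a, b):
    """Euclidean distance over the positions observed in both rows."""
    return sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)
                    if not (isnan(ai) or isnan(bi))))


def knn_impute(rows, k=2):
    """Replace each NaN with the column mean of the k nearest complete rows."""
    complete = [row for row in rows if not any(isnan(v) for v in row)]
    imputed = []
    for row in rows:
        if not any(isnan(v) for v in row):
            imputed.append(list(row))
            continue
        neighbors = sorted(complete, key=lambda other: euclidean_distance(row, other))[:k]
        imputed.append([
            sum(n[j] for n in neighbors) / len(neighbors) if isnan(v) else v
            for j, v in enumerate(row)
        ])
    return imputed


# Hypothetical rows of two features; the last row misses its second feature.
rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [9.0, 90.0],
    [1.5, nan],
]
filled = knn_impute(rows, k=2)
```

The incomplete row is closest to the first two rows, so its missing value becomes the mean of 10.0 and 20.0.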


3.3 The Strategy used to Replace Outliers in Economic and Stock
Market Returns Data of South Africa

Outliers pose a problem in analysis, because some models are sensitive to outliers. The re-
search uses the median outlier replacement strategy to replace outliers with the median value
(subsection 3.3.1).

3.3.1 The Median Outlier Replacement Strategy

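The median outlier replacement strategy uses the median value to replace outliers (i.e., xi
below the 5th percentile or xi above the 95th percentile) in the sample probability
distribution. Equation 3.2 defines the median (M) of feature values sorted in ascending order,
where n is odd:

M = x_{\left(\frac{n+1}{2}\right)}, (3.2)

where n denotes the sample size and x_{(k)} denotes the kth smallest feature value.
Equation 3.3 defines M, where n is even:

M = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}. (3.3)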
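The replacement rule can be sketched with the Python standard library. The example assumes percentiles are estimated with the inclusive interpolation method of `statistics.quantiles`; other percentile conventions give slightly different cut-offs. The data and the function name are hypothetical.

```python
from statistics import median, quantiles


def replace_outliers_with_median(values, lower_pct=5, upper_pct=95):
    """Replace values outside the [5th, 95th] percentile band with the median."""
    # quantiles(..., n=100) returns 99 cut points: the 1st to 99th percentiles.
    cut_points = quantiles(values, n=100, method="inclusive")
    lower, upper = cut_points[lower_pct - 1], cut_points[upper_pct - 1]
    center = median(values)
    return [center if (v < lower or v > upper) else v for v in values]


# Hypothetical feature with one extreme value at the end.
values = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 500.0]
cleaned = replace_outliers_with_median(values)
```

With ten values, the band always excludes the smallest and largest observations, so both 10.0 and 500.0 are replaced by the median (14.5) here; the rule trims the tails whether or not the tail values are implausible.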

3.4 The Strategy used to Reduce Economic Data Dimensions of
South Africa

The curse of data dimensionality problem frequently leads to an upsurge in the amount of
power required for computation [Young 2020]. This research used the principal compo-
nents analysis method to reduce dimensions of economic activities in South Africa (subsec-
tion 3.4.1).

3.4.1 The Principal Components Analysis Method

The principal component analysis method, an unsupervised machine learning method, reduces
feature sets into meaningful eigenvectors. This method is suitable for addressing the
curse of data dimensionality problem [Shen 2009]. Equation 3.4 defines the covariance matrix
from which the principal component analysis method derives eigenvectors:

\frac{1}{m} \sum_{i=1}^{m} (x_i - \bar{x})(x_i - \bar{x})^{T}, (3.4)

where m denotes the sample size and x̄ denotes the mean of xi.

The research ranks economic activities based on the explained variance ratio (σ²i) value to
produce an index that offers insight into the structure of the economy in South Africa.
Equation 3.5 defines σ²i:

\sigma_i^2 = \frac{\hat{v}(\tilde{x}_i)}{\sum_{j=1}^{n} \hat{v}(x_j)}, (3.5)

where v̂ denotes the variance, x̃i denotes the ith principal component, and n denotes the
number of features. A scree plot identifies the number of components needed to complete the
principal component analysis method. The point at which eigenvalues level off in the scree
plot defines the n component selection criterion.
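For two features, the explained variance ratio can be worked out in closed form, which makes the link between the covariance matrix and the ratio explicit. The sketch below is illustrative: it uses the sample covariance (dividing by n − 1, which does not affect the ratio), hypothetical data, and assumed function names.

```python
from math import sqrt
from statistics import mean


def covariance_matrix_2d(x, y):
    """Entries (a, b, c) of the sample covariance matrix [[a, b], [b, c]]."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    a = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    c = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    return a, b, c


def explained_variance_ratios_2d(x, y):
    """Share of total variance captured by each of the two principal components."""
    a, b, c = covariance_matrix_2d(x, y)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    spread = sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [half_trace + spread, half_trace - spread]
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Hypothetical features that move in lockstep (y = 2x).
ratios = explained_variance_ratios_2d([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Because the two features are perfectly correlated, the first component explains all of the variance and the second explains none, which is the extreme case of a scree plot dropping to zero after one component.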

3.5 The Strategy used to Partition Economic and Stock Market
Returns Data of South Africa

This research used the random data partitioning strategy to partition data, so each feature
value has an equal chance of being included in the selection (subsection 3.5.1). It considers
four split ratios to identify the performance of machine learning models over four periods.

3.5.1 The Random Data Partitioning Strategy

The random data partitioning strategy designates feature values to either the training or test
data set through a probabilistic process. Estimating the population proportion establishes the
confidence interval. Equation 3.6 defines the interval estimate of the population proportion:

\hat{p} \pm z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, (3.6)

where p̂ denotes the sample proportion estimate, n denotes the sample size, and z denotes the
critical value. Equation 3.7 defines the mean value of the sample proportion:

µ̂ = p̂. (3.7)

Equation 3.8 defines σ (p̂), the standard error of the sample proportion:

\sigma(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}. (3.8)

Equation 3.9 defines the confidence interval (CI) of the sample proportion:

CI = \left( \hat{p} - z\,\sigma(\hat{p}),\; \hat{p} + z\,\sigma(\hat{p}) \right), (3.9)

where both CI bounds lie between 0 and 1 and ± z denotes critical values.
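A random split and the accompanying interval for the test-set proportion can be sketched as follows. The example assumes the usual normal-approximation interval, p̂ ± z × sqrt(p̂(1 − p̂)/n), with z = 1.96 for a 95% level; the split ratio, seed, and function names are hypothetical choices for illustration.

```python
import random
from math import sqrt


def random_partition(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of rows and split it into (training, test) sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def proportion_confidence_interval(p_hat, n, z=1.96):
    """Normal-approximation interval around a sample proportion."""
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data set of 100 observations, identified by index.
rows = list(range(100))
train, test = random_partition(rows, test_ratio=0.2)
low, high = proportion_confidence_interval(len(test) / len(rows), len(rows))
```

Changing `test_ratio` reproduces the different split ratios considered in this research; fixing `seed` keeps a given split repeatable across runs.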

3.6 The Strategy used to Scale Economic Data of South Africa

Machine learning models require features to share a scale before modeling, so objective
functions perform calculations efficiently. This research used the data standardization scaling
strategy to place feature values in the training data set on a standard scale for apt model
comparison (subsection 3.6.1).


3.6.1 The Data Standardization Scaling Strategy

The data standardization scaling strategy rescales feature values, whereby µ = 0, σ = 1,
and xi is in proximity to 0, so features are expressed on the scale of a standard normal
distribution [Tabak 2004]. Equation 3.10 defines the data standardization scaling strategy:

Scale_{standard} = \frac{x_i - \bar{x}}{\sigma(x_i)}, (3.10)

where xi denotes feature values, x̄ denotes the mean value of xi, and σ (xi) denotes the
standard deviation of xi (the divergence of xi from x̄). Equation 3.11 defines σ (xi):

\sigma(x_i) = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}. (3.11)
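The scaling can be sketched with the sample mean and the sample standard deviation (divisor n − 1). The example below is a minimal illustration on hypothetical values; estimating the mean and standard deviation on the training data and reusing them for the test data is a common convention that is assumed here, and the function names are not from the research code.

```python
from statistics import mean, stdev


def fit_standard_scaler(training_values):
    """Estimate the mean and the sample standard deviation (n - 1) on training data."""
    return mean(training_values), stdev(training_values)


def standardize(values, center, spread):
    """Subtract the mean and divide by the standard deviation."""
    return [(v - center) / spread for v in values]


# Hypothetical training and test values of a single economic feature.
training = [2.0, 4.0, 6.0, 8.0]
center, spread = fit_standard_scaler(training)
scaled_training = standardize(training, center, spread)
scaled_test = standardize([5.0], center, spread)
```

After scaling, the training values have mean 0 and standard deviation 1, and a test value equal to the training mean maps to 0.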

3.7 Experiments for Model Comparison

Table 3.4 shows various experiments considered for model comparison.

TABLE 3.4: Experiments considered when determining whether machine learning models perform
better than the benchmark model when forecasting the reaction of stock market returns to
changing economic activities in South Africa

| Scenario | Description | Principal Component Analysis, Regression, Tree-based, Neural Network, Vector Autoregressive |
| --- | --- | --- |
| Scenario 1 | Investigate the dynamics of the economy and how stock market returns react to changing economic activities in South Africa using the default vector autoregressive model, along with default supervised machine learning models (i.e., the ordinary least-squares regression model, ridge model, least absolute shrinkage and selection operator model, elastic net model, decision tree model, random forest tree model, extreme gradient boosting tree model, recurrent neural network, gated recurrent unit, long-short term memory, restricted Boltzmann machine, and multi-layer perceptron). Following that, compare the performance of default supervised machine learning models against the performance of the benchmark model (the default vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. | * * * * |
| Scenario 2 | Present feature subsets containing key economic features selected based on the gini impurity value calculated by tree-based models. Concurrently, compare the performance of models trained on a feature subset to the performance of models trained on whole features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. | * * * |
| Scenario 3 | Produce low economic data dimensions found by the principal component analysis method. Subsequently, compare the performance of models trained on low economic data dimensions to the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. | * * * * |


3.8 Strategies used to Regularize Regression Models

Machine learning models operate an ℓ1 norm function or ℓ2 norm function (or both) to confine
model parameters by introducing a penalty term. The ridge model regularization strategy
used the ℓ2 norm function, the least absolute shrinkage and selection operator model
regularization strategy used the ℓ1 norm function, and the elastic net model regularization
strategy used the ℓ1 norm function and ℓ2 norm function.

3.8.1 The Ridge Model Regularization Strategy

The ridge model regularization strategy, also known as the ℓ2 norm model regularization
strategy, stabilizes the bias and variance (var) by applying a shrinkage parameter to the
weights β̂j while learning predictor feature values. Equation 3.12 defines the ridge
model regularization strategy:

\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2, (3.12)

where λ denotes the penalty term and p denotes the number of predictor features, leading to
Equation 3.13:

\hat{\beta} = (X'X + \lambda I)^{-1} X'y, (3.13)

where X denotes the matrix of predictor feature values and y denotes the vector of target
feature values. The penalty term penalizes a machine learning model, as it commits errors when
learning predictor feature values and forecasting concealed target feature values.
Equation 3.14 defines the bias:

bias = -\lambda (X'X + \lambda I)^{-1} \beta. (3.14)

Equation 3.15 defines the var:

var = \sigma^2 (X'X + \lambda I)^{-1} X'X (X'X + \lambda I)^{-1}, (3.15)

where σ² denotes the variance of model residuals.
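With a single predictor feature and no intercept, the closed-form ridge solution (X'X + λI)⁻¹X'y reduces to a ratio of two scalars, which shows the shrinkage effect directly. The sketch below is an illustration on hypothetical data; the function name and the penalty values are assumptions.

```python
def ridge_coefficient(x, y, penalty):
    """Closed-form ridge estimate for one predictor without an intercept.

    Scalar case of (X'X + penalty * I)^-1 X'y: sum(x*y) / (sum(x^2) + penalty).
    """
    cross_product = sum(xi * yi for xi, yi in zip(x, y))
    sum_of_squares = sum(xi ** 2 for xi in x)
    return cross_product / (sum_of_squares + penalty)


# Hypothetical data lying exactly on y = 2x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
unpenalized = ridge_coefficient(x, y, penalty=0.0)  # ordinary least-squares estimate
shrunk = ridge_coefficient(x, y, penalty=14.0)      # penalty equal to sum of squares
```

Without a penalty the estimate equals the least-squares slope; a penalty as large as the sum of squared feature values halves it, trading bias for lower variance.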

3.8.2 The Least Absolute Shrinkage and Selection Operator Model Regular-
ization Strategy

The least absolute shrinkage and selection operator model regularization strategy, also
known as the ℓ1 norm model regularization strategy, enhances the performance of machine
learning models by shrinking parameters of models and penalizing model residuals
[Tibshirani 1996]. Equation 3.16 defines the least absolute shrinkage and selection operator
model regularization strategy:

\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 + \lambda \sum_{j=1}^{p} |\beta_j|. (3.16)


3.8.3 The Elastic Net Model Regularization Strategy

The elastic net regularization strategy bundles the penalty terms of the ridge model
regularization strategy and least absolute shrinkage and selection operator model
regularization strategy to manage bias and var, provided a limited n with higher dimensions
[Zou and Hastie 2005a]. It achieves this by eliminating uninformative features and
prioritizing noteworthy features, and contains a quadratic penalty function [Zou and Hastie
2005b; Meier et al. 2008].

Equation 3.17 defines the elastic net model regularization strategy:

\hat{\beta} = \arg\min_{\beta} \frac{\sum_{i=1}^{n} (y_i - x'_i \beta)^2}{2n} + \lambda \left( \frac{1 - \alpha}{2} \sum_{j=1}^{p} \beta_j^2 + \alpha \sum_{j=1}^{p} |\beta_j| \right), (3.17)

where α = 1 for the least absolute shrinkage and selection operator model regularization
strategy, and α = 0 for the ridge model regularization strategy.
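The role of the mixing parameter is easiest to see by evaluating the penalized objective for fixed coefficients. The sketch below assumes the common parameterization in which the squared error is divided by 2n, the ℓ2 term is weighted by (1 − α)/2, and the ℓ1 term is weighted by α; the data, coefficients, and function name are hypothetical.

```python
def elastic_net_objective(X, y, beta, penalty, alpha):
    """Squared-error loss plus a mix of l1 and l2 penalties on the coefficients."""
    n = len(y)
    squared_error = sum(
        (yi - sum(b * xij for b, xij in zip(beta, row))) ** 2
        for row, yi in zip(X, y)
    )
    l1_norm = sum(abs(b) for b in beta)
    l2_norm_squared = sum(b ** 2 for b in beta)
    return squared_error / (2 * n) + penalty * (
        (1 - alpha) / 2 * l2_norm_squared + alpha * l1_norm
    )


# Hypothetical design where beta fits the data exactly, so only the penalty remains.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
beta = [1.0, 2.0]
lasso_like = elastic_net_objective(X, y, beta, penalty=0.1, alpha=1.0)
ridge_like = elastic_net_objective(X, y, beta, penalty=0.1, alpha=0.0)
```

With α = 1 only the ℓ1 term contributes (0.1 × 3), and with α = 0 only the halved ℓ2 term contributes (0.1 × 2.5); intermediate values blend the two.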


3.9 The Metric used to Evaluate the Performance of Models

This research identifies errors models commit when forecasting reactions of stock market
returns to evaluate model performance. The mean absolute percentage error metric (sub-
section 3.9.1) evaluates the benchmark model (the vector autoregressive model), along with
candidate machine learning models, when forecasting the reaction of stock market returns to
changing economic activities in South Africa.

3.9.1 The Mean Absolute Percentage Error Metric

Compared to other regression model performance metrics (i.e., the mean squared error met-
ric, root mean squared error metric, and mean absolute error metric, among other regression
model performance metrics), the mean absolute percentage error (MAPE) metric or mean
absolute percentage deviation (MAPD) metric is useful in evaluating the performance of
regression-based models when the data set contains temporal features, since the metric is
intuitive in model interpretation [Swamidass 2000].

The MAPE metric denotes the divergence ratio of ŷi from yi. It does not consider whether
divergence ratio values are positive or negative, which prevents positive errors and negative
errors from revoking each other, a problem that is common with alternative regression-based
model performance metrics [Myttenaere et al. 2016]. Equation 3.18 defines the MAPE metric:

MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|. (3.18)

The MAPE metric serves as the primary model performance metric in this research. A model
with a MAPE value in proximity to 0% is ideal. Investors may use the MAPE metric to
inform investing decisions.

If the MAPE value is exceptionally high, investors may reconsider investing in the stock
market in South Africa or exclude the index from their portfolio. Conversely, if the MAPE
value is exceptionally low, they can consider investing in the stock market in South Africa
or include the index in their portfolio.
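The metric is straightforward to compute. The sketch below uses the usual definition, with the actual value yi in the denominator, and therefore assumes no actual value is zero; the index levels and the function name are hypothetical.

```python
def mean_absolute_percentage_error(actual, predicted):
    """MAPE in percent; assumes no actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


# Hypothetical index levels and forecasts: 10% over, 10% under, and exact.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape = mean_absolute_percentage_error(actual, predicted)
```

The over-forecast and the under-forecast do not cancel: each contributes 10%, so the MAPE is about 6.67% across the three observations.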

3.10 Ethical Consideration

This research extracts monthly economic features from FRED and SARB databases, and the
monthly stock market price in South Africa from the JSE database. All databases are for gen-
eral access and academic use. This research does not require an ethics clearance certificate,
because all features come from secondary data sources.


3.11 Research Methods Summary

Table 3.5 shows links between research questions and research methods.

TABLE 3.5: Research methods summary

| Research Question | Research Method |
|---|---|
| How do stock market returns react to changing economic activities in South Africa? | Investigate the dynamics of the economy and how stock market returns react to changing economic activities in South Africa using the default vector autoregressive model, along with default supervised machine learning models (i.e., the ordinary least-squares regression model, ridge model, least absolute shrinkage and selection operator model, elastic net model, decision tree model, random forest tree model, extreme gradient boosting tree model, recurrent neural network, gated recurrent unit, long short-term memory, restricted Boltzmann machine, and multi-layer perceptron). |
| Do machine learning models perform better than the benchmark model (the vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric? | Following that, compare the performance of default supervised machine learning models against the performance of the benchmark model (the default vector autoregressive model) when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. |
| Do models trained on a feature subset containing key economic features selected based on the Gini impurity value calculated by tree-based models perform better than the models trained on all features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric? | Present feature subsets containing key economic features selected based on the Gini impurity value calculated by tree-based models. Concurrently, compare the performance of models trained on a feature subset to the performance of models trained on all features when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. |
| Is the performance of models trained on low economic data dimensions (found by the principal component analysis method) distinguishable from the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric? | Produce low economic data dimensions found by the principal component analysis method. Subsequently, compare the performance of models trained on low economic data dimensions to the performance of models trained on high economic data dimensions when forecasting the reaction of stock market returns to changing economic activities in South Africa, as measured by the MAPE metric. |



4 Experiment Results & Discussions

This research investigated the dynamics of the economy and how stock market returns react
to changing economic activities in South Africa. It considered various experiments, i.e.,
different feature subsets and data dimensions, to determine whether machine learning models
perform better than the benchmark model (the vector autoregressive model) when forecasting
the reaction of stock market returns.

Download the source code and data of the research project here:

http://www.github.com/tshepochris

4.1 The Exploration of the Distribution of Stock Market Returns Data in South Africa

Figure 4.1 shows the distribution of stock market returns in South Africa over the same
sample period.

(A) Series (B) Series Distribution

FIGURE 4.1: The distribution of stock market returns data in South Africa

Figure 4.1 (A) shows that, from 2002 to 2022, stock market returns revert to a steady level
after extreme lows or highs, which calms the market. Figure 4.1 (B) shows that stock market
returns approximately follow a Gaussian distribution.

Table 4.1 shows the p-value of the augmented Dickey-Fuller (ADF) test, which tests the
stationarity of stock market returns at α = 0.05: a p-value < 0.05 indicates that stock
market returns in South Africa are stationary, and a p-value > 0.05 indicates that they are
non-stationary.

It also shows the central tendency, dispersion, and shape of the distribution of stock market
returns in South Africa (i.e., the mean, standard deviation, skewness, and kurtosis).

TABLE 4.1: The descriptive statistics of stock market returns data in South Africa

| | Mean | Std | Skew | Kurtosis | ADF p-value |
|---|---|---|---|---|---|
| Stock market returns in South Africa | 1.0105 | 0.0113 | 0.3357 | -0.4565 | 0.0055 |
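For readers who wish to reproduce statistics of the kind reported in Table 4.1, the following Python sketch shows one possible computation. It is an illustration under stated assumptions, not the research code: the function names `describe`, `is_stationary`, and `adf_p_value` are hypothetical, the skewness and kurtosis use population moments (libraries such as pandas apply sample bias corrections, so their values differ slightly), and the kurtosis is assumed to be excess kurtosis, which is 0 for a Gaussian distribution. The ADF p-value relies on the `adfuller` function of the third-party statsmodels package.

```python
import math


def describe(returns):
    """Return the mean, sample standard deviation, skewness, and excess kurtosis."""
    n = len(returns)
    mean = sum(returns) / n
    # Central moments in population form (dividing by n); an assumption of this sketch.
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),  # sample standard deviation
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis: 0 for a Gaussian
    }


def is_stationary(adf_p_value, alpha=0.05):
    """Decision rule from the text: treat the series as stationary when p < alpha."""
    return adf_p_value < alpha


def adf_p_value(returns):
    """Return the ADF test p-value; requires the third-party statsmodels package."""
    from statsmodels.tsa.stattools import adfuller  # imported lazily on purpose

    return adfuller(returns)[1]  # the second element of the result is the p-value


# Made-up values for illustration; they are not the returns used in this research.
sample = [1.02, 0.99, 1.01, 1.00, 1.03, 0.98, 1.01]
print(describe(sample))
print(is_stationary(0.0055))  # applies the rule to the p-value reported in Table 4.1
```

Applying `is_stationary` to the p-value reported in Table 4.1 applies the same decision rule as the text at α = 0.05.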

Table 4.1 shows the mean stock market re