Statistical and deep learning methods in causal inference

Whata, Albert

dc.contributor.author	Whata, Albert
dc.date.accessioned	2022-07-29T07:56:33Z
dc.date.available	2022-07-29T07:56:33Z
dc.date.issued	2021
dc.description	A thesis submitted to the Faculty of Science, University of the Witwatersrand, in fulfilment of the requirements for the degree of Doctor of Philosophy, 2021	en_ZA
dc.description.abstract	Machine learning (ML) algorithms excel at predicting outcomes but not at explaining causality. Deep learning algorithms such as deep neural networks (DNN) are especially good at uncovering hidden patterns in large data sets, but they struggle to make even simple causal inferences. Causal inference is a statistical tool that can be used by machine learning and deep learning to measure the causal effects of multiple variables. This research was carried out to show researchers that it is very important to start incorporating causal inference into machine learning systems and not to focus only on predicting outcomes. A propensity scores-potential outcomes framework was used to evaluate machine learning and statistical causal inference. Using this framework, it was successfully demonstrated that a deep learning algorithm such as DNN can be adapted and used for classification tasks. In addition, the results in this thesis show that, using DNN, one can successfully estimate propensity scores and also reduce absolute bias in the treatment effects that are estimated using these propensity scores. A hybrid model that consisted of a long short-term memory autoencoder (LSTMAE) and the kernel quantile estimator (KQE) algorithm was also successfully developed to detect change-points. Additionally, a multivariate regression discontinuity design (MRDD) was effectively employed to evaluate the statistical causal effect using two assignment variables. The study also demonstrated the importance of accompanying every conventional or multivariate regression discontinuity design with supplementary analyses to give more credibility to the causal estimates.
A hybrid deep learning algorithm that uses a convolutional neural network (CNN) as well as a bidirectional long short-term memory (Bi-LSTM) neural network was developed for the classification of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) among coronaviruses. The model achieved impressive results on metrics such as classification accuracy, area under the receiver operating characteristic curve (AUC ROC), and Cohen’s Kappa. The results show that deep learning algorithms can be used as alternative avenues to detect SARS-CoV-2 among coronaviruses.	en_ZA
dc.description.librarian	CK2022	en_ZA
dc.faculty	Faculty of Science	en_ZA
dc.identifier.uri	https://hdl.handle.net/10539/33077
dc.language.iso	en	en_ZA
dc.phd.title	PhD	en_ZA
dc.school	School of Statistics and Actuarial Science	en_ZA
dc.title	Statistical and deep learning methods in causal inference	en_ZA
dc.type	Thesis	en_ZA

Files

Original bundle

Name: PHD Thesis_ver__5_1_A_Whata.pdf
Size: 2.4 MB
Format: Adobe Portable Document Format

Collections

ETD Collection