Analysis of type 1 diabetes verbal autopsy data by machine learning techniques

dc.contributor.author: Manaka, Thokozile
dc.date.accessioned: 2020-09-14T08:02:29Z
dc.date.available: 2020-09-14T08:02:29Z
dc.date.issued: 2019
dc.description: A dissertation submitted to the University of the Witwatersrand in accordance with the requirements of the degree of Masters in the Faculty of Science. February 2019 [en_ZA]
dc.description.abstract: Big data refers to data sets with large, diverse and complex structures that are often difficult to analyze or visualize using traditional computing methods. Machine learning (ML) techniques are effective in analyzing these types of data and extracting information from them. Health care systems generate large amounts of data from record keeping, and this data supports a wide range of medical decisions, such as population health surveillance and disease management, for the overall improvement of the quality of health care delivery. In areas without health registration systems, verbal autopsy is relied on to provide information about the likely cause of death. In this study, type 1 diabetes (T1DM) verbal autopsy data from the MRC/Wits Rural Public Health Transitions Research Unit (Agincourt) was used as a test case for applying modern machine learning classification techniques to ascertain the cause of death by type 1 diabetes. The machine learning techniques used for the classification task were artificial neural networks (ANNs) and random forests, which were implemented with a Keras front end and TensorFlow. Machine learning algorithms automatically learn to make accurate predictions based on past observations by learning patterns in the data. In this study, they learned features present in patients with diabetes and were able to identify patients who could have died from the disease. This is the first study of type 1 diabetes verbal autopsy data using these two machine learning techniques in South Africa. Performance metrics such as precision, recall and the confusion matrix were used to evaluate the classifiers because the data was highly skewed. The results show that the random forest classifier classified deaths by diabetes better than the artificial neural network. In particular, the ROC score compares favourably with that of a similar study conducted by two clinician specialists in the disease. [en_ZA]
dc.description.librarian: PH2020 [en_ZA]
dc.faculty: Faculty of Science [en_ZA]
dc.identifier.uri: https://hdl.handle.net/10539/29596
dc.language.iso: en [en_ZA]
dc.school: School of Physics [en_ZA]
dc.title: Analysis of type 1 diabetes verbal autopsy data by machine learning techniques [en_ZA]
dc.type: Thesis [en_ZA]
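
The abstract notes that precision, recall and the confusion matrix were preferred because the data was skewed. The following is a minimal sketch of why that matters, not the thesis's own code: the labels are hypothetical (5 diabetes deaths among 100 records), and the helper names `confusion_counts` and `precision_recall` are made up for illustration. A classifier that never predicts the disease scores high accuracy but zero recall.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision and recall, defined as 0.0 when the denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical, heavily skewed labels: 1 = death attributed to diabetes.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that always predicts "not diabetes".
y_always_negative = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_always_negative)
accuracy = (tp + tn) / len(y_true)
precision, recall = precision_recall(tp, fp, fn)

print(f"confusion matrix: tp={tp} fp={fp} fn={fn} tn={tn}")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On data like this, accuracy alone rewards ignoring the minority class, whereas recall exposes that every positive case was missed.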

Files

Original bundle

Name: Final Thesis 2.pdf
Size: 4.96 MB
Format: Adobe Portable Document Format
Description: Final work
