Classifying cancerous tumours using machine learning techniques

dc.contributor.author: Gong, Jingyuan
dc.date.accessioned: 2023-02-01T07:00:07Z
dc.date.available: 2023-02-01T07:00:07Z
dc.date.issued: 2022
dc.description: A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science
dc.description.abstract: Cancer has become a leading cause of death in the modern world. Literature suggests that a third of the population will develop cancer during their lifetime. The focus of the dissertation was to classify tumours as malignant or benign. The data was obtained from the Surveillance, Epidemiology and End Results (SEER) program, which collected data from 1973 to 2018. The SEER program provides data on cancer incidence in the United States and represents 28% of the United States population. The data set contained variables such as age, race, sex, year of diagnosis and tumour classification, along with 14 other variables. The methods used for modelling were K-Nearest Neighbours (KNN), Weighted K-Nearest Neighbours, Artificial Neural Networks, the Naive Bayes classifier and Bayesian Neural Networks. All of these models used the Synthetic Minority Oversampling Technique (SMOTE), as the data set was imbalanced with a ratio of 40 to 1 for malignant tumours. The best model for the data set was the KNN model with five neighbours and SMOTE applied, with an area under the curve (AUC) of 0.781.
dc.description.librarian: CK2023
dc.faculty: Faculty of Science
dc.identifier.uri: https://hdl.handle.net/10539/34363
dc.language.iso: en
dc.title: Classifying cancerous tumours using machine learning techniques
dc.type: Dissertation
Files
Original bundle
Name: Gong_JY_2379445_Dissertation.pdf
Size: 2.81 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.43 KB
Format: Item-specific license agreed upon to submission