Classifying cancerous tumours using machine learning techniques

Gong, Jingyuan

Files

Gong_JY_2379445_Dissertation.pdf (2.81 MB)

Date

2022

Authors

Gong, Jingyuan

Abstract

Cancer has become a leading cause of death in the modern world, and the literature suggests that a third of the population will develop cancer during their lifetime. The focus of this dissertation was to classify tumours as malignant or benign. The data were obtained from the Surveillance, Epidemiology and End Results (SEER) program, which collected data from 1973 to 2018. The SEER program provides data on cancer incidence in the United States and covers 28% of the United States population. The data set contained variables such as age, race, sex, year of diagnosis and tumour classification, along with 14 other variables. The methods used for modelling were K-Nearest Neighbours (KNN), Weighted K-Nearest Neighbours, Artificial Neural Networks, the Naive Bayes classifier and Bayesian Neural Networks. All of these models used the Synthetic Minority Oversampling Technique (SMOTE), as the data set was imbalanced, with a ratio of 40 to 1 for the malignant tumours. The best model for the data set was the KNN model with five neighbours and SMOTE applied, with an area under the curve (AUC) of 0.781.
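
The pipeline named in the abstract (oversample the rare class with SMOTE, fit a five-neighbour KNN, score by AUC) can be illustrated with a minimal sketch. It is not the dissertation's code: it uses only the Python standard library, a two-feature synthetic data set in place of the SEER variables, and hypothetical helper names (`smote`, `knn_score`, `auc`). The rare class is labelled 1 purely for illustration; which class is rare in the actual data is not assumed here.

```python
import math
import random


def k_nearest(point, pool, k):
    """Return the (at most) k points in pool closest to point (Euclidean)."""
    return sorted(pool, key=lambda q: math.dist(point, q))[:k]


def smote(minority, n_new, k=5, rng=None):
    """SMOTE-style oversampling: each synthetic point lies on the segment
    between a sampled minority point and one of its k nearest minority
    neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = rng.choice(k_nearest(base, others, k))
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def knn_score(point, X, y, k=5):
    """Score a point as the fraction of its k nearest training points labelled 1."""
    nearest = sorted(range(len(X)), key=lambda i: math.dist(point, X[i]))[:k]
    return sum(y[i] for i in nearest) / k


def auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative
    (ties count as half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (an assumption, not SEER): two Gaussian clouds with a 40:1 imbalance.
rng = random.Random(42)


def cloud(centre, n):
    return [(rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)) for _ in range(n)]


train_major = cloud(0.0, 200)  # label 0
train_minor = cloud(3.0, 5)    # label 1 (rare class in this toy example)

# Oversample the rare class until both classes have the same size.
synthetic = smote(train_minor, len(train_major) - len(train_minor), k=5, rng=rng)
X_train = train_major + train_minor + synthetic
y_train = [0] * len(train_major) + [1] * (len(train_minor) + len(synthetic))

# Evaluate on held-out points that keep the original 40:1 imbalance.
test_X = cloud(0.0, 80) + cloud(3.0, 2)
test_y = [0] * 80 + [1] * 2
scores = [knn_score(p, X_train, y_train, k=5) for p in test_X]
test_auc = auc(scores, test_y)
```

Only the training set is oversampled here, so the held-out points keep the original class ratio. Whether the dissertation applied SMOTE in this way is not stated in the abstract.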

Description

A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science

URI

https://hdl.handle.net/10539/34363

Collections

ETD Collection
