Galaxy classification using machine learning

dc.contributor.author: Variawa, Mohamed Zayyan
dc.date.accessioned: 2022-09-15T09:02:37Z
dc.date.available: 2022-09-15T09:02:37Z
dc.date.issued: 2021
dc.description: A research report submitted to the School of Computer Science and Applied Mathematics, Faculty of Science, University of the Witwatersrand, in partial fulfillment of the requirements for the degree of Master of Science, 2021 [en_ZA]
dc.description.abstract: An important area of study is galaxy classification, as the type and formation of galaxies often offer insights into the origin and evolution of the universe. Most classifications come from human experts manually inspecting and labelling images of galaxies. Owing to the increased availability of images of galaxies, researchers have coupled machine learning with crowd-sourced labels to automate galaxy classification and save astronomers the time spent on manual classification. However, it is essential to study how well these crowd-sourced labels generalise to more expert classification systems such as the Hubble tuning fork. Multiple ResNet50 models are trained on the crowd-sourced Galaxy Zoo 1 and 2 datasets as well as the expertly labelled EFIGI catalogue to classify galaxies according to their Hubble types. To study the generalisation of the models trained on crowd-sourced data against the models trained on expert data, the expert Revised Shapley-Ames catalogue is used as an unseen test set. Deep Metric Learning techniques are used to fine-tune classification models to improve on the current state-of-the-art results for classifying galaxies. The results show that Transfer Learning coupled with the ResNet50 outperforms self-defined rules for galaxy classification, indicating the effectiveness of machine learning for galaxy classification. The results further demonstrate an improvement on the current state-of-the-art accuracy for both the Galaxy Zoo 2 and EFIGI data, using Transfer Learning with the ResNet50 model. The mean average precision values for the crowd-sourced and expert models indicate that the models are comparable. However, confusion matrices reveal that the models trained on the expert dataset outperform the models trained on the crowd-sourced data in terms of actual vs. predicted labels. This result highlights the need for caution when utilising crowd-sourced labels. The results further show that a model that has been pre-trained on crowd-sourced data using Label Smoothing Cross-Entropy can be fine-tuned using Deep Metric Learning to achieve state-of-the-art performance in galaxy morphology classification. Finally, Transfer Learning from crowd-sourced labelled data to expert-labelled data leads to a significant improvement in classification accuracy. [en_ZA]
dc.description.librarian: CK2022 [en_ZA]
dc.faculty: Faculty of Science [en_ZA]
dc.identifier.uri: https://hdl.handle.net/10539/33199
dc.language.iso: en [en_ZA]
dc.school: School of Computer Science and Applied Mathematics [en_ZA]
dc.title: Galaxy classification using machine learning [en_ZA]
dc.type: Thesis [en_ZA]

Files

Original bundle

Name: MSCDissertation - 852648 - Mohamed Zayyan Variawa (1).pdf
Size: 5.75 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
