Exploring the efficacy of popular clustering techniques on gene expression data

dc.contributor.author: Batista, S. TKS
dc.date.accessioned: 2021-04-27T16:02:24Z
dc.date.available: 2021-04-27T16:02:24Z
dc.date.issued: 2020
dc.description: A dissertation submitted in fulfilment of the requirements for the degree Master of Science, in the School of Computer Science and Applied Mathematics, Faculty of Science, University of the Witwatersrand, Johannesburg, 2020 [en_ZA]
dc.description.abstract: High throughput data has presented a wealth of genomic information, but as yet no gold standard has been presented and tested as a means of analysing this data. This poses the question of whether biological function can be inferred solely from gene expression data of a host at different states. In light of the lack of information that exists on the procedure to be employed in a true exploratory analysis of gene expression data, a robust methodology was implemented. This included the use of a wide array of clustering algorithms along with numerous validation indices to attempt to discover the natural biological classes that existed within significantly unannotated data. While not the most novel of the machine-learning techniques proposed for such data analysis, the k-means algorithm outperformed other methods when validated using known model validation techniques. Testing of the functional biological validity of these results found that they present a sufficiently accurate image of the underlying biological functions. These results, while promising, would require further validation via experimental methods to ensure the accuracy of the biological inferences. [en_ZA]
dc.description.librarian: CK2021 [en_ZA]
dc.faculty: Faculty of Science [en_ZA]
dc.identifier.uri: https://hdl.handle.net/10539/31033
dc.language.iso: en [en_ZA]
dc.school: School of Computer Science and Applied Mathematics [en_ZA]
dc.title: Exploring the efficacy of popular clustering techniques on gene expression data [en_ZA]
dc.type: Thesis [en_ZA]

Files

Original bundle

Name: S_BATISTA_730221 Final Thesis.pdf
Size: 5.73 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
