Partially automated grading of short free-text responses in computer science through sentence embedding and clustering

dc.contributor.author: Philip, Sheena
dc.date.accessioned: 2024-02-06T07:22:47Z
dc.date.available: 2024-02-06T07:22:47Z
dc.date.issued: 2024
dc.description: A research report submitted in partial fulfilment of the requirements for the degree Master of Science (e-Science) to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2023
dc.description.abstract: Educators spend a significant portion of their time marking assessments, time that could be better used for teaching and research to enhance the overall education experience. To assess higher-order thinking, questions that require short text answers are necessary. However, automatically grading these questions is much more complex, since computers need to understand the underlying semantic meaning of the text. Furthermore, the dataset available for grading is limited to a few hundred responses due to the smaller size of lecture classes, which is not sufficient for evaluating most NLP and machine learning methods. To address this, this research aims to partially automate the grading of short free-text responses in computer science by grouping similar responses and manually marking specific submissions that best represent each group. It explores various sentence embedding, clustering, and sampling techniques, and evaluates the Enhancement of Clustering by Iterative Classification (ECIC) algorithm, which improves cluster quality. The study found that Agglomerative clustering combined with the Universal Sentence Encoder (USE) and a sampling strategy that marks submissions based on their distance to the center of the cluster produced the best results, balancing time saved against meeting the performance criteria. This combination resulted in a 65% reduction in the time it takes to grade a question. However, the ECIC algorithm was not effective on datasets that comprise only a few hundred data points.
dc.description.librarian: TL (2024)
dc.description.sponsorship: National e-Science Postgraduate Teaching and Training Platform
dc.faculty: Faculty of Science
dc.identifier.uri: https://hdl.handle.net/10539/37505
dc.language.iso: en
dc.school: Computer Science and Applied Mathematics
dc.subject: Automation
dc.subject: Grading
dc.subject: Algorithm
dc.title: Partially automated grading of short free-text responses in computer science through sentence embedding and clustering
dc.type: Dissertation
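
The workflow summarised in the abstract (embed responses, cluster them, manually mark the submission nearest each cluster center, and share that mark with the rest of the cluster) can be illustrated with a minimal sketch. This is not the dissertation's implementation. It assumes a bag-of-words vector as a stand-in for the Universal Sentence Encoder, a naive average-linkage merge rule (the record does not state which linkage was used), and a fixed cluster count; all function names, the sample responses, and the marking rule are hypothetical.

```python
import math
from collections import Counter

def embed(texts):
    # Stand-in embedding (assumption): L2-normalised bag-of-words vectors,
    # used here in place of the Universal Sentence Encoder.
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def distance(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerative(vectors, n_clusters):
    # Naive average-linkage agglomerative clustering (linkage is an assumption).
    # Starts with one cluster per response and merges the closest pair
    # until n_clusters remain. Returns a list of index lists.
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(distance(vectors[p], vectors[q])
                            for p in clusters[i] for q in clusters[j])
                avg = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def representative(vectors, members):
    # Index of the cluster member closest to the cluster centroid.
    dim = len(vectors[0])
    centroid = [sum(vectors[m][k] for m in members) / len(members)
                for k in range(dim)]
    return min(members, key=lambda m: distance(vectors[m], centroid))

def propagate_marks(texts, n_clusters, mark_fn):
    # mark_fn stands in for the human marker; it is called once per cluster,
    # and its mark is copied to every response in that cluster.
    vectors = embed(texts)
    marks = [None] * len(texts)
    graded = []
    for members in agglomerative(vectors, n_clusters):
        rep = representative(vectors, members)
        graded.append(rep)
        score = mark_fn(texts[rep])
        for m in members:
            marks[m] = score
    return marks, graded

# Hypothetical responses and a hypothetical marking rule.
responses = [
    "last in first out",
    "last in first out order",
    "items leave last in first out",
    "keys map to values",
    "keys map to values by hashing",
]
marks, graded = propagate_marks(responses, 2, lambda t: 1 if "last" in t else 0)
print(marks, "manually marked:", len(graded), "of", len(responses))
```

In this toy run only one response per cluster reaches the marker, which is where the time saving comes from. The saving and the grading accuracy both depend on the embedding and the number of clusters, so the figures reported in the abstract should not be expected from this sketch.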

Files

Original bundle

Name: MSc_final_dissertation.pdf
Size: 1.23 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.43 KB
Format: Item-specific license agreed upon to submission
