Facial action unit classification using weakly supervised learning

dc.contributor.author: Enabor, Oseluole Tobi
dc.date.accessioned: 2024-02-06T07:48:30Z
dc.date.available: 2024-02-06T07:48:30Z
dc.date.issued: 2024
dc.description: A research report submitted in fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2023
dc.description.abstract: Deep learning has gained popularity because of its superior performance when trained on large datasets. However, collecting and annotating large datasets is laborious, expensive, and time-consuming. Weakly supervised learning (WSL) has been at the forefront of exploring solutions to these limitations. WSL techniques can create accurate classifiers under different scenarios, such as datasets with few samples, inaccurate datasets with noisy labels, and datasets that do not have the desired labels. This work applies WSL to facial Action Unit (AU) recognition, a problem space that relies on subject-matter experts (i.e., certified Facial Action Coding System (FACS) coders) to annotate samples. Two WSL techniques were explored: incomplete supervision using a pseudo-labelling mechanism, where one has access to vast amounts of unlabelled data and a limited amount of labelled data, and inaccurate supervision using a Large-Loss Rejection (LLR) mechanism, where one has access to only noisy labels. The pseudo-labelling mechanism involves feeding the model samples with generated pseudo-labels during training. The LLR mechanism, by contrast, prevents the model from learning noisy labels by rejecting samples that report a large loss during training. To better evaluate the limitations posed by the availability of accurate data and labels, and their impact on model training, the authors trained a baseline emotion recognition model and fine-tuned it for AU recognition using transfer learning. This process also helped assess the ability to estimate fine-grained labels (AUs) using only coarse-grained labels (facial emotions). The experimental setup included training and validating a VGG16 convolutional neural network (CNN) on the Extended Denver Intensity of Spontaneous Facial Action Database (DISFA+) and using the Karolinska Directed Emotional Faces (KDEF) dataset for cross-dataset evaluation.
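The two mechanisms described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the confidence threshold of 0.9, and the rejection fraction of 0.25 are all hypothetical, and the model's per-AU probabilities are supplied as plain lists rather than produced by a VGG16 network.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over the AU outputs of one sample."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def select_pseudo_labels(unlabelled_probs, threshold=0.9):
    """Return (index, pseudo_label) pairs for confidently predicted samples.

    Assumption: a sample is kept only if every AU probability is either
    >= threshold (pseudo-label 1) or <= 1 - threshold (pseudo-label 0).
    """
    selected = []
    for i, probs in enumerate(unlabelled_probs):
        if all(p >= threshold or p <= 1.0 - threshold for p in probs):
            selected.append((i, [1 if p >= threshold else 0 for p in probs]))
    return selected


def reject_large_loss(losses, reject_fraction=0.25):
    """Return sorted indices of the samples kept after dropping the
    reject_fraction of samples with the largest loss (assumed rule)."""
    n_reject = int(len(losses) * reject_fraction)
    order = sorted(range(len(losses)), key=lambda i: losses[i])  # ascending
    return sorted(order[: len(losses) - n_reject])


# Pseudo-labelling: sample 1 is dropped because 0.60 is not confident.
unlabelled_probs = [[0.97, 0.02, 0.95], [0.60, 0.05, 0.91], [0.05, 0.03, 0.99]]
pseudo = select_pseudo_labels(unlabelled_probs, threshold=0.9)

# Large-loss rejection: sample 2 disagrees strongly with its (noisy) label.
noisy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
predictions = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.2], [0.1, 0.3]]
losses = [binary_cross_entropy(p, y) for p, y in zip(predictions, noisy_labels)]
kept = reject_large_loss(losses, reject_fraction=0.25)
```

In a training loop, the pairs in `pseudo` would be added to the labelled set for the next round, while only the samples indexed by `kept` would contribute to the gradient update.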
The pseudo-labelling approach for AU recognition produced three models: PL-1 reported a subset accuracy of 68% and a weighted F1-score of 0.56; PL-2a reported a subset accuracy of 89% and a weighted F1-score of 0.9; and PL-2b reported a subset accuracy of 66% and a weighted F1-score of 0.44. The LLR approach for AU recognition reported a subset accuracy of 69% and a weighted average F1-score of 0.66. The baseline AU model reported an accuracy of 97% and an F1-score of 0.98 for AU recognition, signifying the need for large datasets and transfer learning. However, with an average reported accuracy of 68.5%, WSL mechanisms are a step in the right direction and can assist researchers in addressing data annotation challenges.
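For readers unfamiliar with the two metrics reported above, the following sketch shows how subset accuracy and a support-weighted F1-score are commonly computed for multi-label predictions. It assumes the usual definitions (exact-match accuracy per sample, and per-label F1 weighted by the number of true instances of each label); the abstract does not state which implementation was used, and the toy labels are invented for illustration.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full AU label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-label F1, averaged with weights equal to each label's support."""
    n_labels = len(y_true[0])
    weighted_sum = 0.0
    total_support = 0
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        support = tp + fn  # number of true instances of label j
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


# Toy example with three AUs and four samples (invented data).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 1]]

acc = subset_accuracy(y_true, y_pred)  # 2 of 4 samples match exactly
f1 = weighted_f1(y_true, y_pred)       # (2*1.0 + 2*0.5 + 2*1.0) / 6
```

Subset accuracy is strict, since one wrong AU makes the whole sample count as incorrect, which is why it can sit well below the weighted F1-score for the same model.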
dc.description.librarian: TL (2024)
dc.faculty: Faculty of Science
dc.identifier.uri: https://hdl.handle.net/10539/37508
dc.language.iso: en
dc.school: Computer Science and Applied Mathematics
dc.subject: Weakly supervised learning
dc.subject: Machine learning
dc.subject: Facial expression recognition
dc.title: Facial action unit classification using weakly supervised learning
dc.type: Dissertation
Files
Original bundle
Name: Oseluole Enabor 558039 MSc_Final_Dissertation_20230610.pdf
Size: 13.29 MB
Format: Adobe Portable Document Format