Facial action unit classification using weakly supervised learning

Enabor, Oseluole Tobi

dc.contributor.author	Enabor, Oseluole Tobi
dc.date.accessioned	2024-02-06T07:48:30Z
dc.date.available	2024-02-06T07:48:30Z
dc.date.issued	2024
dc.description	A research report submitted in fulfilment of the requirements for the degree of Master of Science to the Faculty of Science, School of Computer Science and Applied Mathematics, University of the Witwatersrand, Johannesburg, 2023
dc.description.abstract	Deep learning has gained popularity because of its strong performance when trained on large datasets. However, collecting and annotating large datasets is laborious, expensive, and time-consuming. Weakly supervised learning (WSL) has been at the forefront of efforts to address these limitations. WSL techniques can create accurate classifiers under different scenarios, such as datasets with few samples, datasets with noisy labels, and datasets that lack the desired labels. This work applies WSL to facial Action Unit (AU) recognition, a problem space that relies on subject-matter experts (i.e., certified Facial Action Coding System (FACS) coders) to annotate samples. Two WSL techniques were explored: incomplete supervision using a pseudo-labelling mechanism, where one has access to vast amounts of unlabelled data and a limited amount of labelled data, and inaccurate supervision using a Large-Loss Rejection (LLR) mechanism, where one has access to only noisy labels. The pseudo-labelling mechanism feeds the model samples with generated pseudo-labels during training. The LLR mechanism, by contrast, prevents the model from learning noisy labels by rejecting samples that report a large loss during training. To better evaluate how the availability of accurate data and labels affects model training, the authors trained a baseline emotion recognition model and fine-tuned it for AU recognition using transfer learning. This process also helped assess the ability to estimate fine-grained labels (AUs) using only coarse-grained labels (facial emotions). The experimental setup included training and validating a VGG16 convolutional neural network (CNN) on the Extended Denver Intensity of Spontaneous Facial Action Database (DISFA+) and using the Karolinska Directed Emotional Faces (KDEF) dataset for cross-dataset evaluation.
The pseudo-labelling approach for AU recognition produced three models: PL-1 reported a subset accuracy of 68% and a weighted F1-score of 0.56, PL-2a reported a subset accuracy of 89% and a weighted F1-score of 0.9, and PL-2b reported a subset accuracy of 66% and a weighted F1-score of 0.44. The LLR approach for AU recognition reported a subset accuracy of 69% and a weighted average F1-score of 0.66. The baseline AU model reported an accuracy of 97% and an F1-score of 0.98 for AU recognition, signifying the need for large datasets and transfer learning. However, with an average reported accuracy of 68.5%, WSL mechanisms are a step in the right direction and can assist researchers in addressing data annotation challenges.
dc.description.librarian	TL (2024)
dc.faculty	Faculty of Science
dc.identifier.uri	https://hdl.handle.net/10539/37508
dc.language.iso	en
dc.school	Computer Science and Applied Mathematics
dc.subject	Weakly supervised learning
dc.subject	Machine learning
dc.subject	Facial expression recognition
dc.title	Facial action unit classification using weakly supervised learning
dc.type	Dissertation
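
The abstract describes Large-Loss Rejection (LLR) as rejecting samples that report a large loss during training. The sketch below illustrates that idea only. It is not the dissertation's implementation: the per-sample rejection rule, the fixed reject_fraction, the helper names, and the toy numbers are all assumptions made for illustration.

```python
import math

def bce_per_sample(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over the AU labels of each sample."""
    losses = []
    for p_row, y_row in zip(probs, labels):
        total = 0.0
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        losses.append(total / len(p_row))
    return losses

def large_loss_keep_mask(losses, reject_fraction=0.25):
    """True for kept samples, False for the largest-loss samples.

    Assumption: a fixed fraction of each batch is rejected; the actual
    rejection schedule used in the dissertation is not stated here.
    """
    n_reject = int(len(losses) * reject_fraction)
    # Sample indices ordered by loss, largest first.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    rejected = set(order[:n_reject])
    return [i not in rejected for i in range(len(losses))]

def masked_mean_loss(losses, keep):
    """Average the loss over kept samples only."""
    kept = [loss for loss, k in zip(losses, keep) if k]
    return sum(kept) / len(kept) if kept else 0.0

# Toy batch: 4 samples, 3 hypothetical AUs; the third sample's labels
# disagree strongly with the model's predictions (a likely noisy label).
probs = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.1], [0.1, 0.9, 0.2], [0.6, 0.4, 0.5]]
noisy_labels = [[1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 0, 1]]

losses = bce_per_sample(probs, noisy_labels)
keep = large_loss_keep_mask(losses, reject_fraction=0.25)
batch_loss = masked_mean_loss(losses, keep)
```

In a training loop, only batch_loss (computed over the kept samples) would be backpropagated, so samples whose labels disagree most with the model contribute no gradient for that step.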

Files

Original bundle

Name: Oseluole Enabor 558039 MSc_Final_Dissertation_20230610.pdf
Size: 13.29 MB
Format: Adobe Portable Document Format

Collections

ETD Collection