Out-of-plane action unit recognition using recurrent neural networks
The face is a fundamental tool to assist in interpersonal communication and interaction between people. Humans use facial expressions to consciously or subconsciously express their emotional states, such as anger or surprise. As humans, we are able to easily identify changes in facial expressions even in complicated scenarios, but the task of facial expression recognition and analysis is complex and challenging to a computer. The automatic analysis of facial expressions by computers has applications in several scientific subjects such as psychology, neurology, pain assessment, lie detection, intelligent environments, psychiatry, and emotion and paralinguistic communication. We look at methods of facial expression recognition, and in particular, the recognition of Facial Action Coding System’s (FACS) Action Units (AUs). Movements of individual muscles on the face are encoded by FACS from slightly different, instant changes in facial appearance. Contractions of specific facial muscles are related to a set of units called AUs. We make use of Speeded Up Robust Features (SURF) to extract keypoints from the face and use the SURF descriptors to create feature vectors. SURF provides smaller sized feature vectors than other commonly used feature extraction techniques. SURF is comparable to or outperforms other methods with respect to distinctiveness, robustness, and repeatability. It is also much faster than other feature detectors and descriptors. The SURF descriptor is scale and rotation invariant and is unaffected by small viewpoint changes or illumination changes. We use the SURF feature vectors to train a recurrent neural network (RNN) to recognize AUs from the Cohn-Kanade database. An RNN is able to handle temporal data received from image sequences in which an AU or combination of AUs are shown to develop from a neutral face. We are recognizing AUs as they provide a more fine-grained means of measurement that is independent of age, ethnicity, gender and different expression appearance. In addition to recognizing FACS AUs from the Cohn-Kanade database, we use our trained RNNs to recognize the development of pain in human subjects. We make use of the UNBC-McMaster pain database which contains image sequences of people experiencing pain. In some cases, the pain results in their face moving out-of-plane or some degree of in-plane movement. The temporal processing ability of RNNs can assist in classifying AUs where the face is occluded and not facing frontally for some part of the sequence. Results are promising when tested on the Cohn-Kanade database. We see higher overall recognition rates for upper face AUs than lower face AUs. Since keypoints are globally extracted from the face in our system, local feature extraction could provide improved recognition results in future work. We also see satisfactory recognition results when tested on samples with out-of-plane head movement, showing the temporal processing ability of RNNs.
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of requirements for the degree of Master of Science. Johannesburg, 2015.