Out-of-plane action unit recognition using recurrent neural networks
Date
2015-05-20
Authors
Trewick, Christine
Abstract
The face is a fundamental tool in interpersonal communication and interaction. Humans use facial
expressions to consciously or subconsciously express their emotional states, such as anger or surprise.
As humans, we can easily identify changes in facial expressions even in complicated scenarios, but facial
expression recognition and analysis is a complex and challenging task for a computer. The automatic
analysis of facial expressions by computers has applications in several fields, such as psychology,
neurology, pain assessment, lie detection, intelligent environments, psychiatry, and emotion and
paralinguistic communication. We look at methods of facial expression recognition, and in particular the
recognition of the Facial Action Coding System's (FACS) Action Units (AUs). FACS encodes movements of
individual facial muscles from slight, instantaneous changes in facial appearance: contractions of
specific facial muscles correspond to a set of units called AUs (for example, AU12, the lip corner
puller, corresponds to contraction of the zygomaticus major).
We make use of Speeded Up Robust Features (SURF) to extract keypoints from the face and use the SURF
descriptors to create feature vectors. SURF provides smaller feature vectors than other commonly used
feature extraction techniques, is comparable to or outperforms other methods with respect to
distinctiveness, robustness, and repeatability, and is much faster than other feature detectors and
descriptors. The SURF descriptor is scale- and rotation-invariant and robust to small viewpoint and
illumination changes.
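For illustration, a minimal sketch of this extraction step using OpenCV's SURF implementation (the dissertation does not specify an implementation; SURF is patented and requires an opencv-contrib build with the nonfree modules enabled, and the fixed-length construction below, keeping the strongest keypoints and zero-padding, is an assumption made here):

```python
import cv2
import numpy as np

# Illustrative sketch only: assumes an OpenCV build with the nonfree
# xfeatures2d module (SURF is absent from default builds).
def surf_feature_vector(image_path, hessian_threshold=400, max_keypoints=30):
    """Extract SURF keypoints from a face image and flatten the strongest
    descriptors into a single fixed-length feature vector."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    keypoints, descriptors = surf.detectAndCompute(img, None)  # 64-D descriptors
    if descriptors is None:
        return np.zeros(max_keypoints * 64, dtype=np.float32)
    # Keep the strongest keypoints so every frame yields the same length.
    order = np.argsort([-kp.response for kp in keypoints])[:max_keypoints]
    selected = descriptors[order]
    # Zero-pad if fewer keypoints were found (e.g., under occlusion).
    padded = np.zeros((max_keypoints, 64), dtype=np.float32)
    padded[:len(selected)] = selected
    return padded.flatten()
```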
We use the SURF feature vectors to train a recurrent neural network (RNN) to recognize AUs from the
Cohn-Kanade database. An RNN can handle the temporal data in image sequences in which an AU or a
combination of AUs develops from a neutral face. We recognize AUs because they provide a fine-grained
means of measurement that is independent of age, ethnicity, gender, and differences in expression
appearance.
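The dissertation does not name a framework or architectural details; as a hedged sketch, an Elman-style RNN for multi-label AU recognition over per-frame feature vectors might look like the following in PyTorch (the layer sizes, the number of AUs, and the sigmoid multi-label head are illustrative assumptions; the feature dimension matches the 30 x 64 vectors from the sketch above):

```python
import torch
import torch.nn as nn

class AURecognizer(nn.Module):
    """Sketch of a recurrent network for multi-label AU recognition.
    Input: a sequence of per-frame SURF feature vectors; output: one
    probability per AU for the sequence. All sizes are illustrative."""
    def __init__(self, feature_dim=1920, hidden_dim=128, num_aus=17):
        super().__init__()
        self.rnn = nn.RNN(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_aus)

    def forward(self, sequences):
        # sequences: (batch, frames, feature_dim)
        _, h_n = self.rnn(sequences)               # final hidden state summarizes the sequence
        return torch.sigmoid(self.head(h_n[-1]))  # independent probability per AU

model = AURecognizer()
frames = torch.randn(4, 20, 1920)   # batch of 4 sequences, 20 frames each
au_probs = model(frames)            # (4, 17): multi-label AU scores
loss = nn.BCELoss()(au_probs, torch.randint(0, 2, (4, 17)).float())
```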
In addition to recognizing FACS AUs from the Cohn-Kanade database, we use our trained RNNs to recognize
the development of pain in human subjects. We make use of the UNBC-McMaster pain database, which contains
image sequences of people experiencing pain. In some cases, the pain causes the face to move out-of-plane
or produces some degree of in-plane movement. The temporal processing ability of RNNs can assist in
classifying AUs where the face is occluded and not facing frontally for part of the sequence.
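To make the temporal argument concrete, a hedged continuation of the sketch above: stepping the RNN one frame at a time while reusing its hidden state means frames with few or no detected keypoints (the zero-padded vectors from the extraction sketch) do not erase earlier evidence. `frame_features` here is a hypothetical list of per-frame feature tensors:

```python
import torch

# Illustrative only: reuse `model` from the sketch above and carry the
# hidden state across frames, so occluded or non-frontal frames do not
# reset what the network has already observed.
hidden = None
for feature_vector in frame_features:       # hypothetical per-frame 1920-D tensors
    frame = feature_vector.view(1, 1, -1)   # (batch=1, seq_len=1, feature_dim)
    _, hidden = model.rnn(frame, hidden)    # hidden state persists frame to frame
au_probs = torch.sigmoid(model.head(hidden[-1]))  # final multi-label AU scores
```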
Results are promising when tested on the Cohn-Kanade database. We see higher overall recognition rates
for upper-face AUs than for lower-face AUs; since our system extracts keypoints globally from the face,
local feature extraction could improve recognition in future work. We also see satisfactory recognition
results when tested on samples with out-of-plane head movement, demonstrating the temporal processing
ability of RNNs.
Description
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Master of Science. Johannesburg, 2015.