AUTOMATED FACIAL ACTION CODING
Most approaches to automatic facial expression analysis attempt to recognize a small set of prototypic emotional facial expressions (i.e., fear, sadness, disgust, anger, surprise, and happiness); for an exhaustive survey of past work on this topic, see Pantic & Rothkrantz (2003). This practice may follow from the work of Darwin and, more recently, Ekman (Lewis & Haviland-Jones, 2000), who suggested that basic emotions have corresponding prototypic expressions. In everyday life, however, such prototypic expressions occur relatively rarely; emotions are displayed more often by subtle changes in one or a few discrete facial features, such as raising the eyebrows in surprise. To detect such subtlety of human emotions and, more generally, to make the information conveyed by facial expressions available for use in the various applications mentioned above, automatic recognition of rapid facial signals, that is, facial action units (AUs), is needed.
Few approaches have been reported for automatic recognition of AUs in images of faces. Some researchers described patterns of facial motion that correspond to a few specific AUs but did not report on actual recognition of these AUs; examples are the studies of Mase (1991) and Essa and Pentland (1997). Almost all other efforts to automate FACS coding addressed automatic AU recognition in face video, combining machine vision techniques (such as optical flow analysis, Gabor wavelets, temporal templates, and particle filtering) with machine learning techniques (such as neural networks, support vector machines, and hidden Markov models).
To detect six individual AUs in face image sequences free of head motions, Bartlett et al. (1999) used a neural network. They achieved 91% accuracy by feeding the network with the results of a hybrid system combining holistic spatial analysis and optical flow with local feature analysis. Donato et al. (1999) used a Gabor wavelet representation and independent component analysis to recognize eight individual AUs and four combinations of AUs, with an average recognition rate of 95.5%, in face image sequences free of head motions. Cohn et al. (1999) used facial feature point tracking and discriminant function analysis to recognize eight individual AUs and seven combinations of AUs, with an average recognition rate of 85%, also in face image sequences free of head motions. Tian et al. (2001) used lip tracking, template matching, and neural networks to recognize 16 AUs occurring alone or in combination in nearly frontal-view face image sequences; they reported an average recognition rate of 87.9%. Braathen et al. (2002) reported on automatic recognition of three AUs using particle filtering for 3D tracking, Gabor wavelets, support vector machines, and hidden Markov models to analyze an input face image sequence with no restriction placed on the head pose.
To recognize 15 AUs occurring alone or in combination in a nearly frontal-view face image sequence, Valstar et al. (2004) used temporal templates. Temporal templates are 2D images constructed from image sequences that show where and when motion in the sequence has occurred. The authors reported an average recognition rate of 76.2%.
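To make the notion of a temporal template concrete, the following minimal sketch builds a motion history image, one common form of temporal template, from a grayscale image sequence by simple frame differencing. It is an illustration under stated assumptions and not the method of Valstar et al. (2004): the function name `motion_history_image`, the `diff_threshold` parameter, and the scaling of recency to the range [0, 1] are hypothetical choices made for this example.

```python
import numpy as np


def motion_history_image(frames, diff_threshold=0.1):
    """Collapse an image sequence into one 2D image of where and when motion occurred.

    frames: array-like of shape (n_frames, height, width) with grayscale intensities.
    diff_threshold: hypothetical cutoff on the absolute frame-to-frame intensity
        change above which a pixel counts as "moving".
    Returns a (height, width) array: 0 where no motion was seen, otherwise the
    relative time in (0, 1] of the most recent motion at that pixel.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two frames of shape (height, width)")

    n_frames = frames.shape[0]
    mhi = np.zeros(frames.shape[1:], dtype=float)
    for t in range(1, n_frames):
        # A pixel is "moving" at time t if its intensity changed enough.
        moved = np.abs(frames[t] - frames[t - 1]) > diff_threshold
        # Later motion overwrites earlier motion, so the value encodes recency.
        mhi[moved] = t / (n_frames - 1)
    return mhi


# Tiny synthetic sequence: one pixel changes early, another changes late.
sequence = np.zeros((3, 2, 2))
sequence[1:, 0, 0] = 1.0  # changes between frame 0 and frame 1
sequence[2, 1, 1] = 1.0   # changes between frame 1 and frame 2
template = motion_history_image(sequence)
```

In the resulting image, nonzero pixels mark where motion occurred and larger values mark more recent motion, so a single 2D image summarizes both the location and the timing of facial movement and can be passed to a classifier.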
In contrast to these approaches to automatic AU detection, which deal almost exclusively with frontal-view face images and cannot handle the temporal dynamics of AUs, Pantic and Patras (2004) addressed the problem of automatic detection of AUs and their temporal segments (onset, apex, offset) from profile-view face image sequences. They used particle filtering to track 15 fiducial facial points in an input face-profile video and temporal rules to recognize the temporal segments of 23 AUs occurring alone or in combination in the input video sequence. Their method achieved an average recognition rate of 88%.
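The idea of labelling temporal segments with rules can be sketched as follows. The example assumes a single per-frame measurement, namely the displacement of one tracked facial point from its neutral position, and labels each frame as neutral, onset, apex, or offset. The function name `label_temporal_segments` and both thresholds are hypothetical; the actual rules of Pantic and Patras (2004) are not given in this text and are likely more elaborate.

```python
def label_temporal_segments(displacement, active_threshold=1.0, change_threshold=0.5):
    """Label each frame of a displacement signal with a temporal segment.

    displacement: per-frame distance of a tracked point from its neutral position.
    active_threshold: hypothetical displacement below which the AU counts as inactive.
    change_threshold: hypothetical frame-to-frame change that separates a growing
        or shrinking displacement from a stable one.
    """
    labels = []
    previous = displacement[0] if len(displacement) > 0 else 0.0
    for value in displacement:
        delta = value - previous
        if value < active_threshold:
            labels.append("neutral")  # point is near its neutral position
        elif delta > change_threshold:
            labels.append("onset")    # displacement is growing
        elif delta < -change_threshold:
            labels.append("offset")   # displacement is shrinking
        else:
            labels.append("apex")     # displaced and stable
        previous = value
    return labels


# Synthetic signal: the point moves away, holds, and returns.
segments = label_temporal_segments([0.0, 2.0, 4.0, 4.0, 2.0, 0.0])
```

For this synthetic signal the labels are neutral, onset, onset, apex, offset, neutral. A practical system would apply such rules to several tracked points per AU and would need to smooth tracking noise before thresholding.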
The only work reported to date that addresses automatic AU coding from static face images is that of Pantic and Rothkrantz (2004). It concerns an automated system for AU recognition in static frontal- and/or profile-view color face images. The system uses a multi-detector approach for facial component localization and a rule-based approach for recognition of 32 individual AUs. The method achieves a recognition rate of 86%.