Abstract
This paper combines the discriminative power of manifold learning with the parsimony of sparse signal representation to perform robust facial expression recognition. By utilizing an ℓ1 reconstruction error and a statistical mixture model, both accuracy and tolerance to occlusion improve without the need for neutral frame subtraction. First, facial features are mapped onto a low-dimensional manifold using supervised Locality Preserving Projections. An ℓ1 minimization then relates the projected features to training exemplars, and reconstruction residuals computed over facial regions determine the expression class. Experiments follow the protocols of the recently published extended Cohn-Kanade (CK+) and GEMEP-FERA datasets. Results demonstrate that posed datasets overemphasize the mouth region, while spontaneous datasets rely more on the upper cheek and eye regions. Despite these differences, the proposed method overcomes previous limitations of applying sparse methods to facial expression recognition and produces state-of-the-art results on both types of dataset.
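The ℓ1 reconstruction-and-classification step described above follows the general pattern of sparse-representation classification: a test sample is coded as a sparse combination of training exemplars, and the class whose exemplars yield the smallest reconstruction residual wins. The following is a minimal NumPy sketch of that pattern only, not the paper's implementation; the dictionary `D` is assumed to hold (already manifold-projected) training exemplars as columns, and the ℓ1 problem is solved with a simple ISTA loop:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def l1_sparse_code(D, y, lam=0.1, n_iter=500):
    """Approximately solve min_x 0.5*||D x - y||^2 + lam*||x||_1 via ISTA.

    D : (d, n) dictionary of training exemplars (columns).
    y : (d,)   test sample in the same feature space.
    """
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient (sigma_max^2)
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)          # gradient of the quadratic term
        x = soft_threshold(x - grad / L, lam / L)
    return x

def classify(D, labels, y, lam=0.1):
    """Assign y to the class whose exemplars best reconstruct it (smallest residual)."""
    x = l1_sparse_code(D, y, lam)
    residuals = {}
    for c in set(labels):
        # Keep only the coefficients belonging to class c.
        mask = np.array([lab == c for lab in labels])
        xc = np.where(mask, x, 0.0)
        residuals[c] = np.linalg.norm(y - D @ xc)
    return min(residuals, key=residuals.get)
```

In the paper's setting this residual comparison would be applied per facial region, so that occluded regions can be discounted; the sketch above shows only the single-region case.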