Abstract
This paper describes a novel graphical model approach to seamlessly coupling and simultaneously analyzing facial emotions and action units. Our method builds on hidden conditional random fields (HCRFs), in which we link the output class label to the underlying emotion of a facial expression sequence and connect the hidden variables to the frame-wise action units. Because HCRFs are formulated with only clique constraints, their labeling of the hidden variables often lacks a coherent and meaningful configuration. We resolve this issue by introducing a partially observed HCRF model and establish an efficient scheme via the Bethe energy approximation to overcome the resulting difficulties in training. For real-time applications, we also propose an online implementation that performs incremental inference with satisfactory accuracy.