Abstract
Facial animation has been combined with text-to-speech synthesis to create innovative multimodal interfaces, such as Web stores and Web-based customer services. This paper presents an image-based facial animation system that uses active appearance models (AAMs) to precisely detect feature points in the human face, which are required for selecting mouth images from the face model's database. To minimize the impact of human error when creating the training data, a new optimization method for building an AAM is proposed. The optimized training set reduces the average feature point location error from 1.15 pixels to 0.17 pixels. The feature points are suitable for automatic morphing between mouth images with large visual differences. Subjective tests show that morphing improves the visual quality of the animation from "fair" to "good".