2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)

Abstract

Emotion conversion using a small speech corpus is very important for expressive text-to-speech systems. The unit selection paradigm has been widely applied to intonation conversion in different languages, using different intonation units. In this paper, an emotion conversion system is proposed for expressive Arabic speech. The system combines the transformation of both spectral and prosodic (pitch, duration, and energy) parameters of speech based on the linguistic context. Unit selection is used for pitch conversion, and the effect of using different intonation units and different pitch detectors is studied. We also study the effect that converting each speech parameter with the proposed system has on different expressions. Subjective tests were carried out to evaluate the system on three target expressions: sadness, happiness, and questioning. Results show the effectiveness of both syllable and word units as the basic intonation unit for pitch conversion; however, using syllables gives higher expressiveness for sadness and happiness. Results also show that pitch contour conversion is the dominant factor for happiness and questioning and strongly affects sadness, while duration conversion affects only sadness, spectral conversion affects only happiness, and decreasing the energy level adds more expressiveness to sadness. Finally, the evaluation of the overall emotion conversion system shows that it adds acceptable expressiveness to Arabic speech with good quality for sadness and happiness. The same results can be obtained for questioning if only the pitch contour is converted, since spectral conversion degrades the output quality without increasing expressiveness, and duration conversion has no effect on questioning.