2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII)

Abstract

Soundscape emotion recognition (SER) aims to automatically recognize the emotions perceived in soundscape recordings. To benchmark SER, we propose a dataset of audio samples called Emo-Soundscapes and two evaluation protocols for machine learning models. We curated 600 soundscape recordings from Freesound.org and created 613 mixed audio clips by combining these recordings. In total, the Emo-Soundscapes dataset contains 1213 6-second, Creative Commons licensed audio clips. We collected ground-truth annotations of perceived emotion for these 1213 clips through a crowdsourced listening experiment in which 1182 annotators from 74 countries ranked the audio clips according to perceived valence and arousal. This dataset allows studying SER and how mixing various soundscape recordings influences their perceived emotion. The dataset is available at http://metacreation.net/emo-soundscapes/.