2008 IEEE International Conference on Multimedia and Expo (ICME)

Abstract

Vocal part detection plays an important role in music information retrieval, yet it remains a difficult task. Previous work has focused on short-time features, which cannot capture some essential long-term characteristics of singing. In this paper, we propose an unsupervised segmentation algorithm based on Dynamic Time Warping that divides a pop song into homogeneous segments, each containing either vocals or purely instrumental music. This segmentation makes it possible to design long-term features or classification schemes that improve the accuracy of vocal part detection. We also present a segment-level classification scheme built on the segmentation result. We show that the classification accuracy is significantly improved.
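
For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the textbook DTW distance between two sequences of feature vectors. It is not the segmentation algorithm proposed in the paper, whose details the abstract does not give. The function name `dtw_distance`, the Euclidean local cost, and the toy one-dimensional frames are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of equal-dimension vectors.

    Assumption: the local cost is Euclidean distance between frames;
    the paper may use a different feature set and cost.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical 1-D "feature frames": the second sequence holds one value longer.
slow = [(0.0,), (1.0,), (1.0,), (2.0,)]
fast = [(0.0,), (1.0,), (2.0,)]
print(dtw_distance(fast, slow))
```

Because DTW tolerates local stretching in time, two passages with the same content but different durations can align at low cost, which is one plausible reason such a measure suits grouping frames into homogeneous segments.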