2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)

Abstract

In this paper, we propose using a deep neural network (DNN) as an effective tool for audio feature extraction. The DNN-derived features can then be used in a subsequent classifier (an SVM in this study) for audio classification. Specifically, we learn bottleneck features from a multi-layer perceptron (MLP) in which Mel filter bank features are used as the network input and one of the hidden layers has far fewer units than the other hidden layers. This narrow hidden layer serves as a bottleneck layer: it creates a constriction in the network that forces the information pertinent to classification into a compact feature representation. We study both unsupervised and supervised bottleneck feature extraction methods and demonstrate that the supervised bottleneck features outperform conventional hand-crafted features and achieve state-of-the-art performance in audio classification.
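
The bottleneck idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes (40 inputs, hidden layers of 64, 8, and 64 units, 10 output classes), the tanh activation, and the helper names `build_mlp` and `bottleneck_features` are all assumptions chosen for illustration, and the weights are random rather than trained. The sketch only shows where the compact feature vector is read out of the network.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (untrained, illustrative)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Apply an affine map followed by tanh (activation choice is an assumption)."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def build_mlp(layer_sizes, seed=0):
    """Build a list of layers connecting consecutive entries of layer_sizes."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


def bottleneck_features(mlp, x, bottleneck_index):
    """Run x forward and return the activations of the bottleneck layer."""
    h = x
    for i, layer in enumerate(mlp):
        h = dense(h, layer)
        if i == bottleneck_index:
            return h  # layers after the bottleneck are not needed for extraction
    raise ValueError("bottleneck_index is out of range")


# Hypothetical sizes: 40 Mel filter bank inputs, hidden layers 64-8-64, 10 classes.
LAYER_SIZES = [40, 64, 8, 64, 10]
BOTTLENECK_INDEX = 1  # the layer mapping 64 units down to 8

mlp = build_mlp(LAYER_SIZES)
frame = [0.1 * (i % 5) for i in range(40)]  # stand-in for one frame of Mel filter bank features
features = bottleneck_features(mlp, frame, BOTTLENECK_INDEX)
print(len(features))  # 8: the compact representation passed to a downstream classifier
```

In a real pipeline the network would first be trained (with or without labels), and the extracted vectors would then be used as inputs to a separate classifier such as an SVM.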
