2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops)

Abstract

This paper addresses the problem of video scene classification based on a small amount of natural language description created for the video stream. The approach combines a conventional tf·idf term-document matrix with scene-class-specific information derived from maximum a posteriori (MAP) estimates and the chi-square statistic. Latent semantic analysis (LSA) is then applied to find co-occurrence terms between documents. The experiments use k-nearest neighbour (kNN) and support vector machine (SVM) classifiers to evaluate the effectiveness of the scene class information and the co-occurrence terms. The classifiers achieved 83.86% (kNN) and 98.11% (SVM) when the MAP estimates and the chi-square statistic were combined with the tf·idf term-document matrix, followed by LSA approximation.
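As a rough illustration of the ingredients named in the abstract, the following minimal sketch computes tf·idf weights, a chi-square score between a term and a scene class, and a cosine-similarity kNN prediction. It is not the authors' implementation: the abstract does not say how the MAP estimates and chi-square scores are merged into the term-document matrix, so that step, the LSA approximation, and the SVM are omitted here. The toy corpus, the labels, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus of scene descriptions (not data from the paper).
docs = [
    ("a man walks into the office and sits at a desk", "indoor"),
    ("two people talk in the kitchen near the table", "indoor"),
    ("a car drives along the street past trees", "outdoor"),
    ("people walk in the park under the trees", "outdoor"),
]
tokenized = [(text.split(), label) for text, label in docs]
n_docs = len(tokenized)

# Document frequency: number of documents containing each term.
df = Counter(t for tokens, _ in tokenized for t in set(tokens))


def tfidf(tokens):
    """Sparse tf·idf vector; terms unseen in the corpus are skipped."""
    tf = Counter(tokens)
    return {t: tf[t] * math.log(n_docs / df[t]) for t in tf if t in df}


def chi_square(term, label):
    """Chi-square statistic from the 2x2 term/class contingency table."""
    a = sum(1 for tok, lab in tokenized if term in tok and lab == label)
    b = sum(1 for tok, lab in tokenized if term in tok and lab != label)
    c = sum(1 for tok, lab in tokenized if term not in tok and lab == label)
    d = sum(1 for tok, lab in tokenized if term not in tok and lab != label)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term occurs in every document or in none
    return n_docs * (a * d - b * c) ** 2 / denom


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


train = [(tfidf(tokens), label) for tokens, label in tokenized]


def classify(text, k=3):
    """Majority vote among the k training documents most similar to text."""
    query = tfidf(text.split())
    ranked = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch, a term that appears in every document (such as "the") receives a tf·idf weight of zero, while a term concentrated in one class (such as "trees" in the outdoor descriptions) receives a high chi-square score, which is the kind of class-specific signal the abstract describes adding to the plain tf·idf representation.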