2015 IEEE International Conference on Multimedia and Expo (ICME)

Abstract

The sequential Monte Carlo probability hypothesis density (SMC-PHD) filter has attracted much interest in nonlinear, non-Gaussian visual tracking because it can handle a variable number of speakers. The SMC-PHD filter employs surviving, spawned and born particles to model the state of the speakers, and jointly estimates the number of speakers and their states. The born particles play a critical role in detecting new speakers, which makes it necessary to propagate them in every frame. However, this increases the computational cost of the visual tracker. Here, we propose to use audio data to determine when to propagate the born particles and to re-allocate the surviving and spawned particles. In our framework, audio data aids the visual SMC-PHD (V-SMC-PHD) filter: the direction of arrival (DOA) angles of the audio sources are used to reshape the distribution of the particles. Experimental results on multi-speaker sequences from the AV16.3 dataset show that the proposed audio-visual SMC-PHD (AV-SMC-PHD) filter improves tracking performance in terms of both estimation accuracy and computational efficiency.
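The idea of letting audio decide when born particles are propagated can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the gating rule, the birth distribution, or the state space. The sketch assumes a one-dimensional state (azimuth in degrees, with no wrap-around), a fixed gate `gate_deg`, Gaussian sampling around each DOA, and a total birth mass `birth_mass`. All function names and parameter values are hypothetical.

```python
import random


def unmatched_doas(doa_angles, track_angles, gate_deg=10.0):
    """Return the DOA angles farther than gate_deg from every current track.

    Assumption: a DOA with no nearby track hints at a new speaker.
    """
    return [d for d in doa_angles
            if all(abs(d - t) > gate_deg for t in track_angles)]


def draw_born_particles(doa_deg, n, sigma_deg=3.0, birth_mass=0.1, rng=None):
    """Sample n born particles around one DOA as (angle, weight) pairs.

    Assumption: the weights sum to birth_mass, a tunable design parameter.
    """
    rng = rng or random.Random()
    weight = birth_mass / n
    return [(rng.gauss(doa_deg, sigma_deg), weight) for _ in range(n)]


def propagate_born(doa_angles, track_angles, n_per_source=50, rng=None):
    """Create born particles only for DOAs not explained by existing tracks.

    Returns an empty list when every DOA is matched, so no birth step
    (and no extra computation) is spent in that frame.
    """
    born = []
    for d in unmatched_doas(doa_angles, track_angles):
        born.extend(draw_born_particles(d, n_per_source, rng=rng))
    return born


if __name__ == "__main__":
    rng = random.Random(0)
    # One existing track near 28 degrees; audio reports sources at 30 and 80.
    born = propagate_born([30.0, 80.0], [28.0], n_per_source=5, rng=rng)
    print(len(born))
```

In this sketch the source at 30 degrees falls inside the gate of the existing track, so only the source at 80 degrees triggers a birth step. When every DOA is matched, no born particles are generated, which is the intuition behind the computational saving described above.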
