2014 IEEE Winter Conference on Applications of Computer Vision (WACV)
Download PDF

Abstract

We propose an interactive video segmentation system built on the basis of occlusion and long term spatio-temporal structure cues. User supervision is incorporated in a superpixel graph clustering framework that differs crucially from prior art in that it modifies the graph according to the output of an occlusion boundary detector. Working with long temporal intervals (up to 100 frames) enables our system to significantly reduce annotation effort with respect to state of the art systems. Even though the segmentation results are less than perfect, they are obtained efficiently and can be used in weakly supervised learning from video or for video content description. We do not rely on a discriminative object appearance model and allow extracting multiple foreground objects together, saving user time if more than one object is present. Additional experiments with unsupervised clustering based on occlusion boundaries demonstrate the importance of this cue for video segmentation and thus validate our system design.
Like what you’re reading?
Already a member?
Get this article FREE with a new membership!

Related Articles