2018 24th International Conference on Pattern Recognition (ICPR)
Abstract

The emergence of user-generated media has driven the explosive growth of online videos. Browsing such large volumes of video is time-consuming and tedious, which makes finding the moments of major or special user preference (i.e., highlight extraction) an urgent problem. Moreover, user subjectivity over a video means that no fixed extraction can meet all user preferences. This paper addresses these problems by proposing a query-related highlight extraction framework that optimizes the selected frames to be both semantically related to the query and visually representative of the entire video. Under this framework, the relevance between the query text and the video frames is first computed in a visual-semantic feature embedding space induced by a convolutional neural network (the Query-Inception network). We then enforce diversity among the selected video frames with the determinantal point process (DPP), a recently introduced probabilistic model for diverse subset selection. Experimental results show that our query-related highlight extraction method is particularly useful for fetching news video content, e.g., showing an abstraction of the entire video while focusing on the parts that match the user query.
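The two-stage pipeline the abstract describes (per-frame query-relevance scoring, then DPP-based diverse subset selection) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the kernel construction `L = diag(q) S diag(q)` and the greedy log-determinant MAP routine are common choices for quality-plus-diversity DPPs and are assumptions here, as are all function and variable names.

```python
import numpy as np

def greedy_dpp_map(L, k):
    """Greedy MAP inference for a DPP: at each step, add the item that
    maximizes the log-determinant of the kernel restricted to the
    selected set. The determinant rewards diversity, so near-duplicate
    items are penalized."""
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_i, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = np.ix_(selected + [i], selected + [i])
            sign, logdet = np.linalg.slogdet(L[idx])
            if sign > 0 and logdet > best_logdet:
                best_i, best_logdet = i, logdet
        selected.append(best_i)
    return selected

def select_highlights(relevance, similarity, k):
    """Combine per-frame query relevance q (e.g. from a visual-semantic
    embedding) and pairwise frame similarity S into the DPP kernel
    L = diag(q) S diag(q), then pick k relevant yet diverse frames."""
    q = np.asarray(relevance, dtype=float)
    L = q[:, None] * np.asarray(similarity, dtype=float) * q[None, :]
    return greedy_dpp_map(L, k)

# Toy example: frames 0 and 1 are near-duplicates, both highly relevant
# to the query; frame 2 is relevant and visually distinct; frame 3 is
# barely relevant. A good highlight pair avoids the duplicate.
q = [1.0, 1.0, 0.9, 0.2]
S = [[1.00, 0.99, 0.10, 0.10],
     [0.99, 1.00, 0.10, 0.10],
     [0.10, 0.10, 1.00, 0.10],
     [0.10, 0.10, 0.10, 1.00]]
selected = select_highlights(q, S, k=2)
```

On this toy input the greedy routine first takes one of the duplicate relevant frames and then skips its near-copy in favor of the distinct frame, which is exactly the relevance-plus-diversity trade-off the abstract attributes to the DPP stage.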