Towards coherent natural language description of video streams

Muhammad Usman Ghani Khan; Lei Zhang; Yoshihiko Gotoh

doi:10.1109/ICCVW.2011.6130306

2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops)

Year: 2011, Pages: 664-671

Authors

Muhammad Usman Ghani Khan, University of Sheffield, United Kingdom
Lei Zhang, Harbin Engineering University, China
Yoshihiko Gotoh, University of Sheffield, United Kingdom

Abstract

This contribution presents an approach to creating smooth and coherent descriptions of video streams. First, conventional image processing techniques are applied to extract high-level features from individual video frames. A natural language description of the frame contents is then produced from these high-level features. To extend the approach to the description of video streams, we introduce units of features and outline how these units can be used to present coherent, smooth and well-phrased descriptions by incorporating spatial and temporal information. The approach is evaluated by calculating an overlap similarity score between human-authored and machine-generated descriptions.
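
The abstract does not say which overlap measure is used, so the following is only a minimal sketch of one common choice: the Jaccard overlap between the word sets of a human-authored and a machine-generated description. The function names, the tokenization rule and the example sentences are all assumptions made for illustration and are not taken from the paper.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_similarity(reference: str, candidate: str) -> float:
    """Jaccard overlap between the word sets of two descriptions (0.0 to 1.0).

    Assumption: the paper's exact metric is not given in the abstract;
    Jaccard overlap is used here as an illustrative stand-in.
    """
    ref_words = set(tokenize(reference))
    cand_words = set(tokenize(candidate))
    if not ref_words and not cand_words:
        return 0.0  # two empty descriptions share nothing measurable
    return len(ref_words & cand_words) / len(ref_words | cand_words)


# Hypothetical descriptions of the same video frame.
human = "A man is sitting at a table"
machine = "A man sits at the table"
score = overlap_similarity(human, machine)
print(f"overlap similarity: {score:.2f}")
```

A set-based measure like this ignores word order and repetition, so it rewards shared vocabulary rather than fluency; an n-gram based variant would be a natural alternative if phrasing matters.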

