Interpreting Camera Operations in the Context of Content-based Video Indexing and Retrieval
In this work, we aim to narrow the gap between low-level media features (e.g., color, texture, and motion) and high-level concepts, a gap that hinders reliable content-based indexing and retrieval. More specifically, we propose a new way to connect geometric and radiometric deformations with their characterization in terms of camera operations. From the apparent motion and the defocus blur (low-level features), we estimate changes in the extrinsic and intrinsic camera parameters, and then deduce 3D camera operations (i.e., mid-level features) such as panning/tracking, tilting/booming, zooming/dollying, and rolling, as well as focus changes. Finally, the camera operations are recorded in an index that is then used for video retrieval. Experiments confirm that the proposed mid-level features can be accurately deduced from the low-level features and that they can be used for indexing and retrieval purposes.
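As a rough illustration of the last two stages (labeling camera operations and recording them in an index for retrieval), the following is a minimal sketch and not the authors' method. It assumes that hypothetical per-frame measurements are already available: horizontal and vertical image translation (`tx`, `ty`), divergence (`div`), in-plane rotation (`rot`), and change in defocus blur (`blur_change`). The threshold value, the function names, and the merged labels (e.g. "pan/track", which image motion alone does not disambiguate) are all assumptions made for the example.

```python
from collections import defaultdict


def classify_operations(tx, ty, div, rot, blur_change, thresh=0.5):
    """Map hypothetical per-frame measurements to camera-operation labels.

    All inputs and the threshold are illustrative assumptions, not values
    or quantities taken from the paper.
    """
    ops = set()
    if abs(tx) > thresh:
        ops.add("pan/track")      # dominant horizontal image motion
    if abs(ty) > thresh:
        ops.add("tilt/boom")      # dominant vertical image motion
    if abs(div) > thresh:
        ops.add("zoom/dolly")     # expanding or contracting motion field
    if abs(rot) > thresh:
        ops.add("roll")           # rotation about the optical axis
    if abs(blur_change) > thresh:
        ops.add("focus change")   # change in defocus blur
    return ops


def build_index(shots):
    """Build an inverted index: operation label -> set of shot ids.

    `shots` maps a shot id to a list of measurement tuples
    (tx, ty, div, rot, blur_change), one per frame.
    """
    index = defaultdict(set)
    for shot_id, frames in shots.items():
        for measurements in frames:
            for op in classify_operations(*measurements):
                index[op].add(shot_id)
    return index


def query(index, *ops):
    """Return the shots that contain every requested operation."""
    sets = [index.get(op, set()) for op in ops]
    return set.intersection(*sets) if sets else set()
```

A query such as `query(index, "zoom/dolly", "focus change")` would then return the shots in which both operations were detected. An inverted index is used here only because it keeps lookups simple; the paper's actual index structure is not described in the abstract.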
Index Terms:
Content-based video retrieval, camera operations, apparent motion, defocus blur, camera motion, focus changes
Citation:
Wei Pan, Francois Deschenes, "Interpreting Camera Operations in the Context of Content-based Video Indexing and Retrieval," crv, pp. 7, The 3rd Canadian Conference on Computer and Robot Vision (CRV'06), 2006