2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Abstract

While action recognition in videos captured by visible-spectrum imaging has received much attention, action recognition in infrared (IR) videos is significantly less explored. Our objective is to exploit imaging data in this modality for the action recognition task. In this work, we propose a novel two-stream 3D convolutional neural network (CNN) architecture by introducing a discriminative code layer and a corresponding discriminative code loss function. The proposed network processes sequences of IR images and of IR-based optical flow fields. We pretrain the 3D CNN model on the visible-spectrum Sports-1M action dataset and finetune it on the Infrared Action Recognition (InfAR) dataset. We conduct an elaborate analysis of different fusion schemes (weighted average, single- and double-layer neural nets) applied to different 3D CNN outputs. Experimental results demonstrate that our approach achieves state-of-the-art average precision on the InfAR dataset.
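As a rough illustration of the simplest fusion scheme named in the abstract, the weighted average, the sketch below combines class scores from an IR stream and an optical-flow stream. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the 0.6 weight, the three-class logits, and the choice to fuse softmax probabilities are all hypothetical, since the abstract does not say which 3D CNN outputs are fused or how the weights are chosen.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores to probabilities along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def weighted_average_fusion(ir_scores, flow_scores, ir_weight=0.5):
    """Blend two streams' class scores with a convex combination.

    ir_weight is a hypothetical hyperparameter; the flow stream
    receives the remaining weight (1 - ir_weight).
    """
    if not 0.0 <= ir_weight <= 1.0:
        raise ValueError("ir_weight must lie in [0, 1]")
    return ir_weight * ir_scores + (1.0 - ir_weight) * flow_scores


# Made-up logits for one clip and three action classes (illustration only).
ir_logits = np.array([[2.0, 0.5, 0.1]])
flow_logits = np.array([[0.2, 1.5, 0.3]])

fused = weighted_average_fusion(
    softmax(ir_logits), softmax(flow_logits), ir_weight=0.6
)
predicted = fused.argmax(axis=-1)  # index of the highest fused score per clip
```

Because both inputs are probability vectors and the weights sum to one, each fused row is again a probability vector. The single- and double-layer neural-net fusion schemes mentioned in the abstract would replace the fixed weight with learned parameters.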