Abstract
Recently, Convolutional Neural Network (CNN) models have achieved remarkable results in fine-grained image classification. However, CNNs require a large amount of labeled training data for supervised learning, and labeling that much data is expensive in many cases. To address this issue, this paper presents a semi-supervised pipeline that improves fine-grained classification without any extra data. We combine CNNs with Generative Adversarial Nets (GANs) for classification and show that the combination is beneficial. In addition, we propose a multi-dimension label regularization (MDLR) method to train on labeled and unlabeled images simultaneously. First, we use a pre-trained YOLO v2 object detection model to detect coarse-grained objects in the original dataset. Second, we feed the cropped images to the generator of a GAN to produce additional generated images, and we assign a uniform label distribution to these generated images. Third, we mix the original real images with the generated images. Finally, the mixed images are fed to a baseline CNN classifier and to a feature-fused CNN classifier. We obtain competitive or state-of-the-art results: on the Stanford Dogs dataset, the feature-fused CNN model sets a new state-of-the-art result of 90.7%; on the Oxford 102 Flowers dataset, we show consistent improvements over the baseline.
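As a rough illustration of the step in which generated images receive a uniform label distribution, the following minimal sketch builds training targets for a mixed batch and computes a cross-entropy loss against them. It is not the MDLR method itself, whose details are not given in this abstract; the function names (`build_targets`, `mixed_cross_entropy`), the `is_generated` mask, and the use of NumPy are assumptions made only for this example.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def build_targets(labels, is_generated, num_classes):
    # Hypothetical helper: one-hot targets for real images,
    # uniform targets (1/K per class) for generated images.
    targets = np.zeros((len(labels), num_classes))
    real_rows = np.flatnonzero(~is_generated)
    targets[real_rows, labels[real_rows]] = 1.0
    targets[is_generated] = 1.0 / num_classes
    return targets

def mixed_cross_entropy(logits, labels, is_generated):
    # Cross-entropy between the predicted distribution and the mixed
    # targets, averaged over the batch. Labels of generated images
    # are ignored because their targets are uniform.
    targets = build_targets(labels, is_generated, logits.shape[1])
    return float(-(targets * log_softmax(logits)).sum(axis=1).mean())

# Example batch: one real image of class 1 and one generated image.
logits = np.array([[0.2, 1.5, -0.3, 0.0],
                   [0.1, 0.1, 0.1, 0.1]])
labels = np.array([1, 0])
is_generated = np.array([False, True])
loss = mixed_cross_entropy(logits, labels, is_generated)
```

Under this assumed formulation, a generated image contributes the least loss when the classifier's prediction for it is flat across classes, so the generated data acts as a regularizer rather than as evidence for any single class.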