2022 26th International Conference on Pattern Recognition (ICPR)

Abstract

Standard discriminative classifiers can be upgraded to joint energy-based models (JEMs) by combining the classification loss with a log-evidence loss. Such models intrinsically allow detection of out-of-distribution (OOD) samples and, empirically, also provide better-calibrated posteriors, i.e., prediction uncertainties. However, the training procedure suggested for JEMs (using stochastic gradient Langevin dynamics, SGLD, to maximize the evidence) is reported to be brittle. In this work we propose to utilize score matching, in particular sliced score matching, to obtain a stable training method for JEMs. We observe empirically that combining score matching with the standard classification loss leads to improved OOD detection and better-calibrated classifiers for otherwise identical DNN architectures. Additionally, we analyze the impact of replacing the regular soft-max layer for classification with a gated soft-max layer, in order to improve the intrinsic transformation invariance and generalization ability.
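The sliced score matching objective mentioned in the abstract can be illustrated with a minimal sketch. Sliced score matching estimates E_v[ vᵀ∇ₓs(x) v + ½(vᵀs(x))² ] over random projection directions v, where s(x) = ∇ₓ log p(x) is the model's score. The toy Gaussian energy, the finite-difference directional derivative, and all names below are illustrative assumptions, not the paper's implementation (which would use a DNN energy and automatic differentiation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: Gaussian energy E(x) = ||x - mu||^2 / (2 sigma^2),
# whose score is available in closed form: s(x) = -(x - mu) / sigma^2.
mu, sigma = 1.0, 2.0

def score(x):
    return -(x - mu) / sigma**2

def sliced_score_matching_loss(x, n_slices=8, eps=1e-4):
    """Monte-Carlo estimate of the sliced score matching objective
    E_v[ v^T J_s(x) v + 0.5 * (v^T s(x))^2 ],
    with the directional derivative v^T J_s(x) v approximated by a
    finite difference along v (autodiff would be used in practice)."""
    total = 0.0
    for _ in range(n_slices):
        v = rng.standard_normal(x.shape)          # random slicing direction
        vs = (v * score(x)).sum(axis=-1)          # v^T s(x), shape (batch,)
        # v^T J_s(x) v  ~=  d/d eps [ v^T s(x + eps v) ]  at eps = 0
        vjv = ((v * score(x + eps * v)).sum(axis=-1) - vs) / eps
        total += (vjv + 0.5 * vs**2).mean()
    return total / n_slices

x = rng.standard_normal((256, 2)) * sigma + mu    # samples from the toy model
print(sliced_score_matching_loss(x))
```

In a JEM, the score would instead come from the classifier logits via s(x) = ∇ₓ logsumexp_y f(x)[y], and this loss would simply be added to the cross-entropy term.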
