2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Abstract

In this work, we focus on the problem of learning a classification model that performs inference on patient Electronic Health Records (EHRs). Often, a large amount of costly expert supervision is required to learn such a model. To reduce this cost, we obtain auxiliary confidence labels that indicate how sure an expert is in the class labels she provides. If this additional confidence information can be incorporated into a classifier, then the number of labeled patient instances required to learn an accurate model may be reduced. To this end, we develop a novel metric learning method called Confidence bAsed MEtric Learning (CAMEL) that not only supports inclusion of confidence labels, but specifically emphasizes model interpretability in three ways. First, CAMEL produces metrics that use only the EHR features relevant to the task, omitting those that are not. Second, CAMEL naturally produces confidence scores that can be considered when making treatment decisions. Third, because it is a metric, CAMEL allows for insightful comparisons to be made, such as finding the past patients who are most similar to a new patient. In our experimental evaluation, we show that CAMEL can use confidence labels to learn models as accurate as current classification methods while using only 10% of the training instances. Finally, we perform qualitative assessments on the metrics learned by CAMEL and show that they identify and clearly articulate important factors in how the model performs inference.
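The abstract does not give CAMEL's optimization procedure, but it does describe how a learned, feature-sparse metric is used at inference time: compare a new patient to past patients under the metric, with irrelevant EHR features zeroed out. The sketch below is a hypothetical illustration of that use case with a diagonal (per-feature-weighted) metric; the weight values and patient vectors are invented for demonstration and are not from the paper.

```python
import numpy as np

# Hypothetical learned diagonal metric: one non-negative weight per EHR feature.
# A sparse weight vector like this omits irrelevant features (weights of 0),
# mirroring the interpretability property described in the abstract.
W = np.array([0.0, 2.5, 0.0, 1.2, 0.8])  # illustrative values only

def metric_distance(x, y, w=W):
    """Weighted squared Euclidean distance (a diagonal Mahalanobis metric)."""
    d = x - y
    return float(np.sum(w * d * d))

def most_similar_patients(new_patient, past_patients, k=3):
    """Rank past patients by learned-metric distance to the new patient."""
    dists = [metric_distance(new_patient, p) for p in past_patients]
    return list(np.argsort(dists)[:k])

# Toy usage: three past patients, five EHR features each.
past = np.array([[1, 0, 5, 2, 1],
                 [0, 1, 3, 2, 0],
                 [1, 1, 0, 0, 1]], dtype=float)
new = np.array([0, 1, 9, 2, 0], dtype=float)
print(most_similar_patients(new, past))  # nearest past patients first
```

Note that the third feature (weight 0) is ignored entirely, so patients differing only in that feature are treated as identical; this is what makes a sparse metric directly inspectable by a clinician.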
