Abstract
Several image classification problems are handled using a classical statistical pattern recognition pipeline: image segmentation, visual feature extraction, and classification. The accuracy of such a solution is typically measured by comparing automatic results with manual classifications, in which the distinction between these three steps is far from clear. In this paper we focus on one of these steps by addressing the following question: does the visual relevance exploited by segmentation algorithms reflect the semantic relevance of the manual annotation performed by the user? For this purpose we chose a gastroenterology scenario in which clinicians classified a set of images into three types (cancer, pre-cancer, normal) and manually segmented the area they believed was responsible for this classification. Afterwards, we quantified how well two popular segmentation algorithms (mean shift, normalized cuts) produced a single image patch approximating the manual annotation. Results show that, for this case study, the resemblance is close for a large percentage of the images when normalized cuts is used.
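To make the evaluation criterion concrete, the sketch below shows one way to score how well an automatic segmentation yields a single patch resembling the manual annotation. The Jaccard index and the helper name best_segment_overlap are illustrative assumptions, not the measure or code used in the paper.

```python
import numpy as np

def best_segment_overlap(labels, manual_mask):
    """Return the automatic segment that best matches the manual annotation.

    labels      : 2-D integer array, per-pixel segment labels produced by any
                  segmentation method (e.g., mean shift or normalized cuts).
    manual_mask : 2-D boolean array, the clinician's annotated region.

    Overlap is scored with the Jaccard index (assumed here for illustration).
    """
    best_label, best_score = None, 0.0
    for lab in np.unique(labels):
        segment = labels == lab
        intersection = np.logical_and(segment, manual_mask).sum()
        union = np.logical_or(segment, manual_mask).sum()
        score = intersection / union if union else 0.0
        if score > best_score:
            best_label, best_score = lab, score
    return best_label, best_score
```

Under this assumed criterion, a segmentation "approximates" the manual annotation when the best-overlapping segment reaches a high score; aggregating these scores over the image set gives the kind of per-algorithm comparison reported in the paper.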