ICDAR 2015 competition HTRtS: Handwritten Text Recognition on the tranScriptorium dataset

Joan Andreu Sanchez; Alejandro H. Toselli; Veronica Romero; Enrique Vidal

doi:10.1109/ICDAR.2015.7333944

Abstract

This paper describes the second edition of the Handwritten Text Recognition (HTR) contest on the tranScriptorium datasets, which was held in the context of the International Conference on Document Analysis and Recognition 2015. Two tracks with different conditions on the use of training data were proposed. Nine research groups registered for the contest, but only three ultimately submitted results. The handwritten images for this contest were drawn from the English “Bentham collection” dataset used in the tranScriptorium project. A small subset of this collection was chosen for the present HTR competition. The selected subset was written by several hands and exhibits considerable variability and difficulty in text image quality, writing styles, and crossed-out text. This contest is clearly more difficult than the first edition, both for training and for testing. A portion of the training dataset and the full test dataset were provided in the form of carefully segmented line images, along with the corresponding transcripts. Another portion of the training dataset was provided as raw images and their corresponding transcripts at region level. The three participants achieved good results, with transcription word error rates ranging from 31% to 44%.
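
For readers unfamiliar with the metric, word error rate is commonly computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognized transcript, divided by the number of reference words. The following is a minimal sketch under stated assumptions, not the competition's evaluation tool: the function name `word_error_rate`, whitespace tokenization, and case-sensitive matching are all illustrative choices, and the abstract does not specify the normalization actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are whitespace-separated and compared case-sensitively.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference. A corpus-level figure is usually obtained by summing edit distances over all lines and dividing by the total number of reference words, rather than by averaging per-line rates.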
