Abstract
Microarrays can measure the expression of thousands of genes in parallel across many experimental samples. The unsupervised technique of bicluster analysis has previously been employed to uncover gene expression correlations over subsets of samples, with the aim of modelling the natural functional classes of genes. However, the bicluster model also has the potential to shed light on the functions of unannotated open reading frames (ORFs), and this aspect of biclustering has been under-explored. In this work we illustrate how the bicluster representation of expression data may be extended to enable putative functional classification of unannotated ORFs. We develop an ORF annotation approach, referred to as BALBOA, in which classifiers are constructed from the class-specific expression patterns discovered by bicluster analysis. We demonstrate the efficacy of this approach via cross-validation and carry out a comparative evaluation against kNN classification across three yeast expression datasets. Finally, we assign putative functions to unannotated ORFs and attempt to corroborate the best-supported annotations with external experimental and protein sequence information.