Abstract
Protein-protein interactions play a crucial role in cellular processes. Although recent studies have elucidated a large number of protein-protein interactions within Saccharomyces cerevisiae, many remain to be identified. This paper presents a new interaction prediction method that associates domains and other protein features by using Support Vector Machines (SVMs), and it reports the effect of those protein features on prediction accuracy. Cross-validation tests revealed that the highest F-measure, 79%, was obtained by combining the features "domain," "amino acid composition," and "subcellular localization." These results were more accurate than previously reported predictions. Furthermore, predicting the interaction of unknown protein pairs revealed that high-scoring protein pairs tend to share similar GO annotations in the biological process hierarchy. This method can be applied across species.
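
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how an F-measure can be computed, assuming the standard balanced definition (the harmonic mean of precision and recall). The function name `f_measure` and the counts used are hypothetical illustrations and are not taken from this paper's experiments.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure from confusion counts (assumed standard definition).

    tp: interacting pairs correctly predicted as interacting
    fp: non-interacting pairs wrongly predicted as interacting
    fn: interacting pairs wrongly predicted as non-interacting
    """
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only, not results from the paper.
score = f_measure(tp=80, fp=20, fn=20)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a predictor scores well only when precision and recall are both high.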