Abstract
Traditional Chinese Medicine (TCM) is a holistic, integrative approach to medicine. Exploring the relations between herbal formulae and symptoms is a central problem in TCM research. Unlike existing work, we treat it as both a multi-instance and a multi-label learning problem. In this paper, we propose a novel approach, Weighted Sampling based on Similar Herbs MIML (WSSH-MIML), which predicts a formula's primary symptoms within the multi-instance multi-label (MIML) framework. We compare our model with three state-of-the-art multi-label learning algorithms, and the experimental results indicate that it outperforms them. This study suggests that the MIML technique provides a new research paradigm for mining meaningful TCM information.