Abstract
Network embedding aims to learn effective low-dimensional vector representations for the nodes of a network and has attracted considerable attention in recent years. Existing methods focus mainly on network structure and cannot leverage the abundant label information that is potentially valuable for learning better representations. Because label information is noisy and incomplete, integrating it into the vector representations of a partially labeled network is challenging. To address this issue, we investigate the effects of label information through label homophily. Briefly, label homophily not only drives nodes with similar labels to connect to each other, but also divides a network into densely connected, homogeneous parts that are only weakly connected to one another. We further propose a novel Label Homophily Oriented Network Embedding (LHONE) model that exploits label homophily by converting a partially labeled network into two bipartite networks and learning vector representations in combination with a Gaussian mixture model (GMM). Extensive experiments on two real-world networks demonstrate the effectiveness of LHONE compared with state-of-the-art network embedding approaches.