2016 IEEE International Conference on Big Data (Big Data)

Abstract

The explosion of Web information has produced massive collections of web documents such as news pages and online literature. The latent topics behind these documents spread through self-evolution and mutual transition. Understanding how topics in documents evolve and transition is an important and challenging problem. Topic models are a powerful set of tools for modeling document generation and finding underlying topics, but they usually operate at the unigram level, which makes it difficult to model the relationship between terms and their underlying topics. In this paper, we propose a pairwise topic model (PTM) that incorporates pairwise relationships into topic modeling. The method discovers latent topics and topic transitions simultaneously in a natural way. We show that the pairwise topic model facilitates the discovery of individual topics as well as topic evolution. The results indicate that the proposed method yields a significant improvement in language perplexity over traditional topic modeling methods such as Latent Dirichlet Allocation (LDA). In addition, we conduct a series of empirical studies to show the topic words and topic transitions discovered. The case studies show that PTM methods help people explicitly understand how topics evolve and transition between one another.
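
The abstract compares models by language perplexity, where lower values indicate a better fit to held-out text. As a rough illustration only, the following sketch computes perplexity from per-token log-probabilities using the standard definition. The function name `perplexity` and the probability values are assumptions made for this example; they are not taken from the paper and do not reproduce its evaluation.

```python
import math

def perplexity(token_log_probs):
    """Standard definition: exp of the negative mean natural-log probability per token."""
    n = len(token_log_probs)
    if n == 0:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_log_probs) / n)

# Hypothetical held-out log-probabilities for the same four tokens under two models.
# These numbers are made up for illustration and are not results from the paper.
baseline_log_probs = [math.log(0.25)] * 4
candidate_log_probs = [math.log(0.5)] * 4

# The model that assigns higher probability to the held-out tokens has lower perplexity.
print(perplexity(baseline_log_probs), perplexity(candidate_log_probs))
```

In this toy case the second model assigns each token twice the probability, so its perplexity is half that of the first.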