Abstract
The goal of document representation is to capture certain features of a document. Many existing document representation methods are based on the bag-of-words model and ignore the semantic relatedness between words in the document. We therefore propose a semantic smoothed topic model for document representation. It takes the semantic similarity between words into consideration when determining the topics of a document. We conducted two experiments applying this method to text classification and information retrieval tasks. The experimental results suggest that our method is useful for capturing the semantics of text and for alleviating the problems of polysemy, synonymy, and data sparseness.