2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06)
pp. 195-201
Personalized Spam Filtering with Semi-supervised Classifier Ensemble
Victor Cheng, Hong Kong Baptist University, Hong Kong
C.H. Li, Hong Kong Baptist University, Hong Kong
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/WI.2006.132
Abstract
The proliferation of unsolicited emails, also
known as spam, poses a significant burden to email users
worldwide. Recent research on spam filtering has
shown that high accuracy can be obtained if labeled
email examples are available from the particular user
of the spam filter. However, the time-consuming process
of providing personalized labeled training examples is
often inconvenient or impossible due to privacy issues.
In this paper, a semi-supervised personalized spam
filter based on a classifier ensemble is proposed that
classifies users' emails accurately by learning from both
generic labeled emails and personalized unlabeled
emails. The proposed multi-stage classification process
begins by learning an SVM model from labeled generic
data. Unlabeled users' emails are then fed to this SVM
to generate personalized labeled data for constructing
personalized naive Bayes classifiers. Furthermore,
some personalized labeled examples are generated by
exploiting rare word distributions and then fed into a
semi-supervised classifier. The multi-stage results are
integrated with SVMs learned from generic labeled
emails to produce the final classification results.
Experimental results show that the proposed
approaches can significantly increase the
classification accuracy in spam filtering.
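
The first two stages described in the abstract (a generic classifier that labels a user's unlabeled emails, followed by a personalized naive Bayes classifier trained on those labels) can be illustrated with a minimal sketch. It is not the authors' implementation: a simple perceptron stands in for the SVM so the example needs only the Python standard library, the rare-word and ensemble stages are omitted, and every function name and all email text are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_linear(docs, labels, epochs=10):
    """Perceptron used as a stand-in for the generic SVM (assumption).
    Labels are +1 (spam) and -1 (ham)."""
    w, bias = Counter(), 0.0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            score = bias + sum(w[t] for t in tokenize(doc))
            if y * score <= 0:          # misclassified or on the boundary
                for t in tokenize(doc):
                    w[t] += y
                bias += y
    return w, bias

def linear_predict(model, doc):
    w, bias = model
    score = bias + sum(w[t] for t in tokenize(doc))
    return 1 if score > 0 else -1

def train_naive_bayes(docs, labels):
    """Multinomial naive Bayes: per-class word counts and document counts."""
    counts = {1: Counter(), -1: Counter()}
    class_docs = Counter(labels)
    for doc, y in zip(docs, labels):
        counts[y].update(tokenize(doc))
    vocab = set(counts[1]) | set(counts[-1])
    return counts, class_docs, vocab

def nb_predict(model, doc):
    counts, class_docs, vocab = model
    total_docs = sum(class_docs.values())
    best, best_lp = None, float("-inf")
    for y in (1, -1):
        # Laplace smoothing on the class prior and the word likelihoods
        lp = math.log((class_docs[y] + 1) / (total_docs + 2))
        denom = sum(counts[y].values()) + len(vocab)
        for t in tokenize(doc):
            if t in vocab:              # ignore words never seen for this user
                lp += math.log((counts[y][t] + 1) / denom)
        if lp > best_lp:
            best, best_lp = y, lp
    return best

# Stage 1: learn from generic labeled emails (hypothetical toy data).
generic_docs = ["cheap pills buy now", "win money now free",
                "meeting agenda for monday", "project report attached"]
generic_labels = [1, 1, -1, -1]
generic_model = train_linear(generic_docs, generic_labels)

# Stage 2: label the user's unlabeled emails with the generic model,
# then train a personalized naive Bayes classifier on those labels.
user_docs = ["free pills lottery winner", "buy lottery tickets now winner",
             "monday seminar agenda thesis", "thesis report draft seminar"]
pseudo_labels = [linear_predict(generic_model, d) for d in user_docs]
personal_model = train_naive_bayes(user_docs, pseudo_labels)

print(pseudo_labels)
print(linear_predict(generic_model, "lottery winner"),
      nb_predict(personal_model, "lottery winner"))
```

In this toy data the words "lottery" and "winner" never occur in the generic training set, so the generic model alone cannot flag an email containing only those words, while the personalized model picks them up from the pseudo-labeled user emails.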
Additional Information
Citation:
Victor Cheng, C.H. Li, "Personalized Spam Filtering with Semi-supervised Classifier Ensemble," 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2006 Main Conference Proceedings)(WI'06), pp. 195-201, 2006.