Affiliation:
1. Hewlett-Packard Laboratories, Palo Alto, CA
Abstract
Spam, also known as Unsolicited Commercial Email (UCE), is the bane of email communication. Many data mining researchers have addressed the problem of detecting spam, generally by treating it as a static text classification problem. True in vivo spam filtering has characteristics that make it a rich and challenging domain for data mining. Indeed, real-world datasets with these characteristics are typically difficult to acquire and to share. This paper demonstrates some of these characteristics and argues that researchers should pursue in vivo spam filtering as an accessible domain for investigating them.
Publisher
Association for Computing Machinery (ACM)
References: 32 articles.
1. An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages
2. CAUBE.AU. Spam volume statistics. Web page: http://www.caube.org.au/spamstats.html, 2002.
Cited by
47 articles.