
Bayesian Clustering for Email Campaign Detection
Course URL: http://videolectures.net/icml09_haider_bcecd/
Lecturer: Peter Haider
Institution: University of Potsdam
Date: 2009-08-26
Language: English
Description: We discuss the problem of clustering elements according to the sources that have generated them. For elements that are characterized by independent binary attributes, a closed-form Bayesian solution exists. We derive a solution for the case of dependent attributes that is based on a transformation of the instances into a space of independent feature functions. We derive an optimization problem that produces a mapping into a space of independent binary feature vectors; the features can reflect arbitrary dependencies in the input space. This problem setting is motivated by the application of spam filtering for email service providers. Spam traps deliver a real-time stream of messages known to be spam. If elements of the same campaign can be recognized reliably, entire spam and phishing campaigns can be contained. We present a case study that evaluates Bayesian clustering for this application.
Keywords: clustering elements; independent binary attributes; Bayesian solution
Source: VideoLectures.NET
Last reviewed: 2019-04-23:lxf
Views: 24
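
The course description mentions that a closed-form Bayesian solution exists when elements have independent binary attributes. As a rough illustration of what such a closed form can look like, the sketch below scores a group of binary vectors with a Beta-Bernoulli model, where each attribute's unknown probability is integrated out analytically. The Beta(1, 1) prior, the function names, and the toy data are assumptions made for illustration; they are not taken from the lecture and may differ from the model used there.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(cluster, alpha=1.0, beta=1.0):
    """Log marginal likelihood of a list of binary vectors.

    Assumes attributes are independent and each attribute's Bernoulli
    parameter has a Beta(alpha, beta) prior (an illustrative choice).
    """
    if not cluster:
        return 0.0
    n = len(cluster)
    total = 0.0
    for j in range(len(cluster[0])):
        ones = sum(x[j] for x in cluster)  # how often attribute j is set
        total += log_beta(alpha + ones, beta + n - ones) - log_beta(alpha, beta)
    return total


def log_predictive(x, cluster, alpha=1.0, beta=1.0):
    """Log probability that x comes from the same source as `cluster`."""
    return log_marginal(cluster + [x], alpha, beta) - log_marginal(cluster, alpha, beta)


# Hypothetical toy data: two messages from one campaign, both with both attributes set.
campaign = [[1, 1], [1, 1]]
print(math.exp(log_predictive([1, 1], campaign)))  # similar message
print(math.exp(log_predictive([0, 0], campaign)))  # dissimilar message
```

Comparing `log_predictive` across candidate groups (including an empty group, which stands for a new source) gives one simple way to decide where a new message belongs under these assumptions. It does not cover the dependent-attribute case, which the description says is handled by a learned mapping into independent binary features.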