Bayesian Clustering for Email Campaign Detection
Course URL: http://videolectures.net/icml09_haider_bcecd/
Lecturer: Peter Haider
Institution: University of Potsdam
Date: 2009-08-26
Language: English
Course description: We discuss the problem of clustering elements according to the sources that generated them. For elements characterized by independent binary attributes, a closed-form Bayesian solution exists. For the case of dependent attributes, we derive a solution based on transforming the instances into a space of independent feature functions. We derive an optimization problem whose solution is a mapping into a space of independent binary feature vectors; these features can reflect arbitrary dependencies in the input space. The problem setting is motivated by spam filtering for email service providers. Spam traps deliver a real-time stream of messages known to be spam. If messages belonging to the same campaign can be recognized reliably, entire spam and phishing campaigns can be contained. We present a case study that evaluates Bayesian clustering for this application.
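
To illustrate why independent binary attributes admit a closed-form Bayesian treatment, here is a minimal sketch using the standard Beta-Bernoulli conjugate model. It is an assumption made for illustration, not necessarily the exact formulation used in the lecture: the function names, the symmetric prior parameters `alpha` and `beta`, and the merge criterion are all hypothetical.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal_likelihood(cluster, alpha=1.0, beta=1.0):
    """Log marginal likelihood of a cluster of binary vectors.

    Assumes each attribute is an independent Bernoulli variable whose
    parameter has a Beta(alpha, beta) prior (an illustrative choice).
    Integrating the parameter out gives, per attribute,
    B(alpha + ones, beta + zeros) / B(alpha, beta).
    """
    if not cluster:
        return 0.0
    n = len(cluster)
    total = 0.0
    for j in range(len(cluster[0])):
        ones = sum(x[j] for x in cluster)
        total += log_beta(alpha + ones, beta + n - ones) - log_beta(alpha, beta)
    return total


def log_merge_gain(cluster_a, cluster_b, alpha=1.0, beta=1.0):
    """Log Bayes factor for 'one shared source' versus 'two separate sources'."""
    joint = log_marginal_likelihood(cluster_a + cluster_b, alpha, beta)
    separate = (log_marginal_likelihood(cluster_a, alpha, beta)
                + log_marginal_likelihood(cluster_b, alpha, beta))
    return joint - separate
```

Under this sketch, a positive `log_merge_gain` means the data are better explained by a single source, so two messages with identical binary features score higher than two that disagree. The dependent-attribute case described in the abstract needs the learned feature mapping and is not covered here.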
Keywords: clustering elements; independent binary attributes; Bayesian solution
Source: VideoLectures.NET
Last reviewed: 2019-04-23 (lxf)
Views: 14