Bayesian Clustering for Email Campaign Detection |
|
Course URL: | http://videolectures.net/icml09_haider_bcecd/ |
Lecturer: | Peter Haider |
Institution: | University of Potsdam |
Date: | 2009-08-26 |
Language: | English |
Course description: | We discuss the problem of clustering elements according to the sources that have generated them. For elements that are characterized by independent binary attributes, a closed-form Bayesian solution exists. We derive a solution for the case of dependent attributes that is based on a transformation of the instances into a space of independent feature functions. We derive an optimization problem that produces a mapping into a space of independent binary feature vectors; the features can reflect arbitrary dependencies in the input space. This problem setting is motivated by the application of spam filtering for email service providers. Spam traps deliver a real-time stream of messages known to be spam. If elements of the same campaign can be recognized reliably, entire spam and phishing campaigns can be contained. We present a case study that evaluates Bayesian clustering for this application. |
Keywords: | clustering elements; independent binary attributes; Bayesian solution |
Source: | VideoLectures.NET |
Last reviewed: | 2019-04-23:lxf |
Views: | 19 |
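
The description states that a closed-form Bayesian solution exists when elements are characterized by independent binary attributes. The sketch below illustrates one common way such a closed form can arise; it is not taken from the lecture. It assumes each cluster draws every attribute from its own Bernoulli parameter with a Beta(alpha, beta) prior, so the parameter can be integrated out analytically. The function names, the prior, and the toy data are all hypothetical, and no prior over partitions is included.

```python
from math import lgamma


def log_beta(a, b):
    # Logarithm of the Beta function B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b).
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(cluster, alpha=1.0, beta=1.0):
    """Log marginal likelihood of one cluster of binary vectors.

    Assumption (not from the source): attributes are independent, and each
    attribute's Bernoulli parameter has a Beta(alpha, beta) prior that is
    integrated out in closed form.
    """
    if not cluster:
        return 0.0
    n = len(cluster)
    total = 0.0
    for j in range(len(cluster[0])):
        ones = sum(x[j] for x in cluster)  # how often attribute j is set
        total += log_beta(alpha + ones, beta + n - ones) - log_beta(alpha, beta)
    return total


def log_partition_score(clusters, alpha=1.0, beta=1.0):
    # Clusters are treated as independent sources, so their log scores add.
    return sum(log_marginal(c, alpha, beta) for c in clusters)


# Toy data: two hypothetical "campaigns" with disjoint attribute patterns.
a = [1, 1, 0, 0]
b = [0, 0, 1, 1]
by_campaign = [[a, a], [b, b]]  # messages grouped by their true source
mixed = [[a, b], [a, b]]        # messages from different sources mixed
```

Under these assumptions, comparing log_partition_score(by_campaign) with log_partition_score(mixed) shows that the grouping by source receives the higher score, which is the kind of comparison a clustering procedure would use to prefer one partition over another.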