Identifying Suspicious URLs: An Application of Large-Scale Online Learning
Course URL: http://videolectures.net/icml09_ma_isu/
Lecturer: Justin Ma
Institution: University of California
Date: 2009-09-26
Language: English
Course description: This paper explores online learning approaches for detecting malicious Web sites (those involved in criminal scams) using lexical and host-based features of the associated URLs. We show that this application is particularly well suited to online algorithms because the training data is larger than can be efficiently processed in batch, and because the distribution of features that typify malicious URLs changes continuously. Using a real-time system we developed for gathering URL features, combined with a real-time source of labeled URLs from a large Web mail provider, we demonstrate that recently developed online algorithms can be as accurate as batch techniques, achieving classification accuracies of up to 99% over a balanced data set.
Keywords: malicious website detection; online learning; online algorithms
Source: VideoLectures.NET
Last reviewed: 2020-06-12, 章泽平 (volunteer course editor)
Views: 78