Effective and Real-time In-App Activity Analysis in Encrypted Internet Traffic Streams
Course URL: http://videolectures.net/kdd2017_xiong_internet_traffic_streams/
Lecturer: Hui Xiong
Institution: Rutgers, The State University of New Jersey
Date: 2017-10-09
Language: English
Course description: Mobile in-App service analysis, which aims to classify mobile internet traffic into different types of service usage, has become a challenging and urgent task for mobile service providers because in-App services increasingly adopt secure protocols. While some efforts have been made to classify mobile internet traffic, existing methods rely on complex feature construction and a large storage cache, which lead to low processing speed, and are thus impractical for online real-time scenarios. To this end, we develop an iterative analyzer for classifying encrypted mobile traffic in real time. Specifically, we first select an optimal set of the most discriminative features from the raw features extracted from traffic packet sequences, using a novel Maximizing Inner activity similarity and Minimizing Different activity similarity (MIMD) measurement. To develop the online analyzer, we represent a traffic flow as a series of time windows, where each window is described by the optimal feature vector and is updated iteratively at the packet level. Instead of extracting feature elements from a series of raw traffic packets, our feature elements are updated whenever a new traffic packet is observed, so raw traffic packets do not need to be stored. The time windows generated by the same service usage activity are grouped by our proposed method, namely recursive time continuity constrained KMeans clustering (rCKC). The feature vectors of the cluster centers are then fed into a random forest classifier to identify the corresponding service usages. Finally, we provide extensive experiments on real-world traffic data from WeChat, WhatsApp and Facebook to demonstrate the effectiveness and efficiency of our approach. The results show that the proposed analyzer provides high accuracy in real-world scenarios, with a low storage cache requirement and fast processing speed.
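The packet-level update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `WindowStats`, the function `stream_windows`, the one-second window length, and the particular features (packet count, byte total, mean, standard deviation and maximum of packet length) are all assumptions chosen for illustration, not the MIMD-selected feature set. The sketch only shows that per-window features can be maintained incrementally without keeping raw packets.

```python
import math
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Running statistics for one time window; raw packets are never stored."""
    count: int = 0
    total_bytes: int = 0
    mean_len: float = 0.0
    m2: float = 0.0      # running sum of squared deviations (Welford's method)
    max_len: int = 0

    def update(self, packet_len: int) -> None:
        """Fold one newly observed packet into the window's features."""
        self.count += 1
        self.total_bytes += packet_len
        delta = packet_len - self.mean_len
        self.mean_len += delta / self.count
        self.m2 += delta * (packet_len - self.mean_len)
        self.max_len = max(self.max_len, packet_len)

    def vector(self) -> list:
        """Return the illustrative feature vector for this window."""
        variance = self.m2 / self.count if self.count else 0.0
        return [self.count, self.total_bytes, self.mean_len,
                math.sqrt(variance), self.max_len]


def stream_windows(packets, window_seconds=1.0):
    """Yield (window_index, feature_vector) from time-ordered (timestamp, length) pairs.

    Only the statistics of the currently open window are held in memory.
    Windows that contain no packets are skipped.
    """
    current_idx = None
    stats = None
    for timestamp, length in packets:
        idx = int(timestamp // window_seconds)
        if idx != current_idx:
            if stats is not None:
                yield current_idx, stats.vector()
            current_idx, stats = idx, WindowStats()
        stats.update(length)
    if stats is not None:
        yield current_idx, stats.vector()


if __name__ == "__main__":
    demo = [(0.1, 100), (0.5, 300), (1.2, 50)]  # made-up packets
    for idx, features in stream_windows(demo):
        print(idx, features)
```

In a fuller pipeline, the yielded window vectors would be the input to the grouping step (rCKC in the talk) and the cluster centers would go to a classifier; both steps are omitted here because the abstract does not specify them in enough detail.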
Keywords: mobile apps; service analysis; service usage
Source: VideoLectures.NET
Data collected: 2023-06-01:chenxin01
Last reviewed: 2023-06-01:chenxin01
Views: 21