Online Clustering of High-Dimensional Trajectories under Concept Drift
Course URL: http://videolectures.net/ecmlpkdd2011_krempl_trajectories/
Lecturer: Georg Krempl
Institution: Czech Technical University in Prague
Date: 2011-11-30
Language: English
Course description: Historical transaction data are collected in many applications, e.g., patient histories recorded by physicians and customer transactions collected by companies. An important question is how to learn models of the primary objects (patients, customers) rather than of the transactions, especially when these models are subject to drift. We address this problem by combining advances in online clustering of multivariate data with the trajectory mining paradigm. We model the measurements of each individual primary object (e.g., its transactions), taken at irregular time intervals, as a trajectory in a high-dimensional feature space. We then cluster individuals with similar trajectories to identify sub-populations that evolve similarly, e.g., groups of customers that evolve similarly or groups of employees that have similar careers. We assume that the multivariate trajectories are generated by drifting Gaussian Mixture Models. We study (i) an EM-based approach that clusters these trajectories incrementally as a reference method that has access to all the data for learning, and propose (ii) an online algorithm based on a Kalman filter that efficiently tracks the trajectories of Gaussian clusters. We show that while both methods approximate the reference well, the algorithm based on a Kalman filter is faster by one order of magnitude compared to the EM-based approach.
Keywords: Computer Science; Data Mining; Temporal and Stream Mining
Source: VideoLectures.NET
Last reviewed: 2020-06-06, 魏雪琼 (volunteer course editor)
Views: 65
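
Illustrative sketch (not part of the original listing): the course description mentions an online algorithm that uses a Kalman filter to track drifting Gaussian clusters from measurements taken at irregular time intervals. The Python below is a minimal sketch of that general idea only, not the authors' algorithm. It assumes a single cluster whose members are already known, independent coordinates (a diagonal covariance), and a constant-velocity drift model. The names `DriftingCoordinate` and `track_centre` and the noise parameters `q` and `r` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DriftingCoordinate:
    """Constant-velocity Kalman filter for one coordinate of a cluster centre."""
    pos: float = 0.0   # estimated centre coordinate
    vel: float = 0.0   # estimated drift per unit time
    p00: float = 1e3   # variance of pos (vague prior, assumed)
    p01: float = 0.0   # covariance between pos and vel
    p11: float = 1e3   # variance of vel (vague prior, assumed)
    q: float = 1e-3    # process-noise intensity (assumed)
    r: float = 1.0     # measurement-noise variance (assumed)

    def predict(self, dt):
        """Propagate the state over an arbitrary (irregular) time gap dt."""
        # State transition x <- F x with F = [[1, dt], [0, 1]].
        self.pos += dt * self.vel
        # Covariance P <- F P F^T + Q (white-noise acceleration model for Q).
        p00 = self.p00 + dt * (2.0 * self.p01 + dt * self.p11)
        p01 = self.p01 + dt * self.p11
        self.p00 = p00 + self.q * dt ** 3 / 3.0
        self.p01 = p01 + self.q * dt ** 2 / 2.0
        self.p11 = self.p11 + self.q * dt

    def update(self, z):
        """Correct the state with one measurement z of the position."""
        s = self.p00 + self.r            # innovation variance, H = [1, 0]
        k0, k1 = self.p00 / s, self.p01 / s
        resid = z - self.pos
        self.pos += k0 * resid
        self.vel += k1 * resid
        # Covariance P <- (I - K H) P, using the pre-update entries.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11


def track_centre(observations, q=1e-3, r=1.0):
    """Track one cluster centre from (time, vector) pairs sorted by time."""
    filters, last_t = None, None
    for t, z in observations:
        if filters is None:
            filters = [DriftingCoordinate(q=q, r=r) for _ in z]
            last_t = t
        dt = t - last_t
        for f, zi in zip(filters, z):
            f.predict(dt)
            f.update(zi)
        last_t = t
    return filters


# Synthetic, noise-free centre drifting linearly, sampled at irregular times.
times = [0.0, 0.7, 1.5, 3.1, 3.6, 5.0, 6.2, 8.9, 9.4, 11.0, 12.5, 14.1, 15.0]
obs = [(t, [1.0 + 0.5 * t, 2.0 - 0.25 * t]) for t in times]
filters = track_centre(obs)
drift = [f.vel for f in filters]       # estimated drift per coordinate
centre = [f.pos for f in filters]      # estimated centre at the last time
```

Each measurement costs a constant amount of work per coordinate, which is why filter-style updates suit streaming data. A full method along the lines described in the course description would also have to assign trajectories to clusters and track several clusters at once, which this sketch leaves out.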