Online Discovery and Maintenance of Time Series Motifs
Course URL: http://videolectures.net/kdd2010_mueen_odmt/
Lecturer: Abdullah Al Mueen
Institution: University of New Mexico
Date: 2010-10-01
Language: English
Abstract: The detection of repeated subsequences, known as time series motifs, is a problem that has been shown to have great utility for several higher-level data mining algorithms, including classification, clustering, segmentation, forecasting, and rule discovery. In recent years, significant research effort has been spent on efficiently discovering these motifs in static offline databases. However, for many domains, the inherent streaming nature of time series demands online discovery and maintenance of time series motifs. In this paper, we develop the first online motif discovery algorithm, which monitors and maintains motifs exactly, in real time, over the most recent history of a stream. Our algorithm has a worst-case update time that is linear in the window size, and it is extensible to maintain more complex pattern structures. In contrast, the current offline algorithms either need significant update time or require very costly pre-processing steps that online algorithms simply cannot afford. Our core ideas allow useful extensions of our algorithm to deal with arbitrary data rates and to discover multidimensional motifs. We demonstrate the utility of our algorithms with a variety of case studies in the domains of robotics, acoustic monitoring, and online compression.
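To make the problem statement concrete, the sketch below shows what "maintaining the motif over the most recent history of a stream" could mean in its simplest form. It is not the algorithm from the paper: it is a naive baseline that recomputes the closest pair of subsequences from scratch on every arrival, which costs time quadratic in the window size per update and is the kind of cost the abstract says an online method cannot afford. The function names (`znorm`, `window_motif`, `stream_motifs`), the use of z-normalized Euclidean distance, and the rule that the two subsequences must not overlap are all assumptions made for illustration.

```python
import math
from collections import deque


def znorm(seq):
    """Z-normalize a subsequence; a constant subsequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def window_motif(window, m):
    """Return (distance, i, j) for the closest pair of non-overlapping
    length-m subsequences in `window`, or None if no such pair exists.
    Brute force: compares every admissible pair."""
    data = list(window)
    subs = [znorm(data[i:i + m]) for i in range(len(data) - m + 1)]
    best = None
    for i in range(len(subs)):
        # j >= i + m excludes trivial matches between overlapping subsequences
        for j in range(i + m, len(subs)):
            d = math.dist(subs[i], subs[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    return best


def stream_motifs(stream, w, m):
    """Yield the current motif after each arrival once the window holds w points.
    The deque drops the oldest point automatically when a new one arrives."""
    window = deque(maxlen=w)
    for x in stream:
        window.append(x)
        if len(window) == w:
            yield window_motif(window, m)


if __name__ == "__main__":
    stream = [0, 1, 0, 5, 2, 9, 0, 1, 0, 7]
    for motif in stream_motifs(stream, w=10, m=3):
        print(motif)
```

In this toy stream the pattern `0, 1, 0` occurs at offsets 0 and 6 of the window, so that pair is reported as the motif. An online method in the sense of the abstract would avoid the full recomputation and update its answer incrementally as points enter and leave the window.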
Keywords: repeated subsequences; static offline databases; multidimensional motifs
Source: VideoLectures.NET
Last reviewed: 2019-05-11 (cjy)
Views: 60