

Time Series Shapelets: A New Primitive for Data Mining
Course URL: http://videolectures.net/kdd09_ye_tssnpdm/
Lecturer: Lexiang Ye
Institution: University of California
Date: 2009-09-14
Language: English
Course description: Classification of time series has been attracting great interest over the past decade. Recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat for most time series problems. While this may be considered good news, given the simplicity of implementing the nearest neighbor algorithm, there are some negative consequences of this. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in a time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data. In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. As we shall show with extensive empirical evaluations in diverse domains, algorithms based on the time series shapelet primitives can be interpretable, more accurate and significantly faster than state-of-the-art classifiers.
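Keywords: time series classification; nearest neighbor algorithm; time series primitives; classification accuracy
Source: VideoLectures.NET
Last reviewed: 2020-06-06 by Wang Yongbin (volunteer course editor)
Views: 130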
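
The course description calls shapelets subsequences that are "maximally representative of a class" but does not say how a shapelet is used at classification time. The Python sketch below shows one plausible reading, offered as an illustration and not as the lecture's actual method. It assumes Euclidean distance, a hand-picked shapelet, and a hand-picked threshold. The function names `subsequence_distance` and `classify` and the class labels are hypothetical.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series (assumed distance measure)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def classify(series, shapelet, threshold):
    """Hypothetical two-class rule: a series belongs to the shapelet's
    class when some window lies within `threshold` of the shapelet."""
    if subsequence_distance(series, shapelet) <= threshold:
        return "has_shape"
    return "other"


# Illustrative data only: a small bump as the shapelet.
shapelet = [0.0, 1.0, 0.0]
with_bump = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
flat = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(classify(with_bump, shapelet, threshold=0.5))
print(classify(flat, shapelet, threshold=0.5))
```

Under these assumptions, classifying a new series needs only the stored shapelet and threshold, not the whole training set. That is the storage and search cost the description attributes to nearest neighbor classification. The matched window also shows which part of the series drove the decision.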