

Mining Complex Dynamic Data
Course URL: http://videolectures.net/ecmlpkdd2011_tutorial_mining/
Lecturers: Myra Spiliopoulou, Irene Ntoutsi, Grigoris Tsoumakas, Arthur Zimek
Institution: Aristotle University of Thessaloniki
Date: 2011-10-03
Language: English
Course description: In recent years, many applications have required mining from richer data types than conventional data(base) records. The analysis of social networks requires combining activity recordings with content (e.g. resource descriptions and user records); recommendation engines must consider user ratings, customer transactions, item descriptions and user profiles; medical applications require combining different kinds of patient recordings, including historical data on ailments and medication. At the same time, the mining tasks become more elaborate: the data are multi-faceted and adhere to many orthogonal or overlapping concepts; the data accumulate or form streams; they are dynamic and call for adaptation of the mining models. In this tutorial, we discuss mining on complex data, with the emphasis on learning and adaptation over streaming, dynamic data. We consider three categories of complex data: data that adhere to multiple overlapping labels, high-dimensional data that contain interesting subspaces, and data that span multiple tables. For each category, we first provide a comprehensive overview of static mining methods, and then focus on methods and example applications for dynamic data. For multi-label stream data, we focus on the example application of document (news) categorization; the core methods are stream classification with decision trees, prediction and ranking. For high-dimensional stream data, we focus on the example applications of bioinformatics and network intrusion; the core methods are stream subspace clustering and outlier detection. For multi-relational stream data, we consider two example applications: analysis of dynamic social networks, and analysis of evolving customer data; the core methods are tensor-based clustering, and multi-relational clustering and classification.
The target groups are: postgraduate students with a solid background in data mining; research scholars who work on conventional stream mining and are confronted with applications on complex data; practitioners who own applications on complex and dynamic data.
Keywords: data type mining; user ratings; customer transactions
Source: VideoLectures.NET
Last reviewed: 2019-04-07 (cwx)
Views: 16