Network Regression with Predictive Clustering Trees |
|
Course URL: | http://videolectures.net/ecmlpkdd2011_stojanova_ceci_regression/ |
Lecturers: | Daniela Stojanova, Michelangelo Ceci |
Institution: | University of Bari |
Date: | 2011-11-30 |
Language: | English |
Course description: | Regression inference in network data is a challenging task in machine learning and data mining. Network data describe entities represented by nodes, which may be connected with (related to) each other by edges. Many network data sets are characterized by a form of autocorrelation, where the value of the response variable at a given node depends on the values of the variables (predictor and response) at the nodes connected to it. This phenomenon directly violates the assumption of independent and identically distributed (i.i.d.) observations. At the same time, it offers a unique opportunity to improve the performance of predictive models on network data, as inferences about one entity can be used to improve inferences about related entities. In this paper, we propose a data mining method that explicitly considers autocorrelation when building regression models from network data. The method is based on the concept of predictive clustering trees (PCTs), which can be used for both clustering and predictive tasks: PCTs are decision trees viewed as hierarchies of clusters, and they provide symbolic descriptions of the clusters. In addition, PCTs can be used for multi-objective prediction problems, including multi-target regression and multi-target classification. Empirical results on real-world network regression problems show that the proposed extension of PCTs performs better than traditional decision tree induction when autocorrelation is present in the data. |
Keywords: | machine learning; regression; network analysis; clustering |
Source: | VideoLectures.NET |
Last reviewed: | 2020-06-13: 邬启凡 (volunteer course editor) |
Views: | 97 |