Mining Relational Model Trees |
|
Course URL: | http://videolectures.net/solomon_appice_mrmt/ |
Lecturer: | Annalisa Appice |
Institution: | University of Bari |
Date: | 2007-02-25 |
Language: | English |
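Abstract (translated from the Chinese): | Multi-relational data mining (MRDM) is the process of discovering implicit, previously unknown, and potentially useful information in data spread across multiple tables of a relational database. MRDM has to cope with the considerable complexity added to a data mining task when the properties of the units of analysis under study may be influenced by attributes of related units of analysis, possibly of different types, which are naturally modeled as one table per object type. Regression is a basic task in MRDM: the goal is to examine samples of past experience with known continuous answers (the response) and to generalize to future cases through induction. Following the mainstream of MRDM research, Mr-SMOTI takes the structural approach: it recursively partitions data stored in a tightly coupled database and builds a multi-relational model tree that captures the linear dependence between the response variable and one or more explanatory variables of both the reference objects and the task-relevant objects. The tree is built top-down by choosing, at each step, either to partition the training space (a split node) or to introduce a regression variable into the linear models associated with the leaves (a regression node). Internal regression nodes contribute to the definition of several models and capture global effects, whereas straight-line regressions at the leaves capture only local effects. Tight coupling with the database makes knowledge of the data structure (for example, foreign keys) available at no cost to guide the search in the multi-relational pattern space. |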
Course description: | Multi-Relational Data Mining (MRDM) refers to the process of discovering implicit, previously unknown and potentially useful information from data scattered across multiple tables of a relational database. MRDM is needed to cope with the substantial complexity added to data mining tasks when the properties of the units of analysis under investigation may be affected by attributes of related units of analysis, possibly of different types, which are naturally modeled as one table per object type. Regression is a fundamental task in MRDM, where the goal is to examine samples of past experience with known continuous answers (the response) and generalize to future cases through an inductive process. Following the mainstream of MRDM research, Mr-SMOTI resorts to the structural approach in order to recursively partition data stored in a tightly coupled database and build a multi-relational model tree that captures the linear dependence between the response variable and one or more explanatory variables of both the reference objects and the task-relevant objects. The model tree is induced top-down by choosing, at each step, either to partition the training space (split nodes) or to introduce a regression variable into the linear models to be associated with the leaves (regression nodes). Internal regression nodes contribute to the definition of multiple models and capture global effects, while straight-line regressions at the leaves capture only local effects. The tight coupling with the database makes knowledge of the data structure (e.g., foreign keys) available free of charge to guide the search in the multi-relational pattern space. |
Keywords: | data mining; natural modeling; continuous answers |
Source: | VideoLectures.NET |
Last reviewed: | 2019-09-21:cwx |
Views: | 37 |
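The description above says the model tree is induced top-down by choosing, at each step, between a split node and a regression node. The sketch below illustrates that choice on a single flat table. It is a simplified illustration under stated assumptions, not Mr-SMOTI itself: there are no joins or foreign keys, and the scoring rule (line error versus piecewise-constant error), the function names `build` and `predict`, and the parameters `depth` and `min_leaf` are all hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares straight line y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:                      # constant column: fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_regression(rows, ys):
    """Best single-variable line; returns (error, column, a, b)."""
    best = None
    for j in range(len(rows[0])):
        xs = [r[j] for r in rows]
        a, b = fit_line(xs, ys)
        err = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
        if best is None or err < best[0]:
            best = (err, j, a, b)
    return best

def best_split(rows, ys, min_leaf):
    """Best threshold split; returns (error, column, threshold) or None."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted(set(r[j] for r in rows))[:-1]:
            left = [y for r, y in zip(rows, ys) if r[j] <= t]
            right = [y for r, y in zip(rows, ys) if r[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, j, t)
    return best

def build(rows, ys, depth=3, min_leaf=2):
    """Top-down induction: each step is a regression node or a split node."""
    if depth == 0 or len(rows) < 2 * min_leaf:
        return ("leaf", mean(ys))
    reg = best_regression(rows, ys)
    split = best_split(rows, ys, min_leaf)
    # Assumed rule: take the regression step when it is at least as good.
    if split is None or reg[0] <= split[0]:
        _, j, a, b = reg
        # Pass the residuals down, so this line applies to every leaf below.
        residuals = [y - a - b * r[j] for r, y in zip(rows, ys)]
        return ("reg", j, a, b, build(rows, residuals, depth - 1, min_leaf))
    _, j, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[j] > t]
    return ("split", j, t,
            build([r for r, _ in left], [y for _, y in left], depth - 1, min_leaf),
            build([r for r, _ in right], [y for _, y in right], depth - 1, min_leaf))

def predict(tree, row):
    """Sum the regression-node terms along the path, then add the leaf value."""
    total = 0.0
    while tree[0] != "leaf":
        if tree[0] == "reg":
            _, j, a, b, tree = tree
            total += a + b * row[j]
        else:
            _, j, t, left, right = tree
            tree = left if row[j] <= t else right
    return total + tree[1]
```

For example, `build([[float(x)] for x in range(10)], [2 * x + 1 for x in range(10)])` puts a regression node at the root, because one line fits the data exactly. A step-shaped response yields a split node instead. A regression node placed above a split contributes to the prediction at every leaf beneath it, which is the sense in which internal regression nodes capture global effects.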