Importing Knowledge Fragments to CMS-Enabled Data Mining Analytical Reports
Course URL: http://videolectures.net/eswc2010_hazucha_ikftc/
Lecturer: Andrej Hazucha
Institution: University of Economics
Date: 2010-06-30
Language: English
Abstract: Descriptive data mining only bears fruit when its results reach the end user in a palatable form. The vehicle for delivering mining results to end users (together with associated information such as the data schema, task settings, and domain background knowledge) is the so-called analytical report. To manage a large number of reports referring to different mining sessions, we designed a data mining web portal based on a content management system; together they are called SEWEBAR-CMS. One of the requirements on the CMS was the ability to interact with semantic knowledge sources and other structured data, see [1]. A data analyst who authors an analytical report in the CMS has several ways of (semi-)automatically entering structured data into the text. First, for locally stored data, such as mining task, result, and data descriptions exported from mining tools in PMML (Predictive Model Markup Language), a CMS plugin can pick marked segments of HTML code, produced from PMML using XSLT, and insert them into the report where the analyst indicates. Second, sophisticated support for remote data and knowledge has been newly added. The infrastructure for this functionality allows the persistent specification of: links to queryable resources; template queries for these resources, which the end user can parameterize at runtime; and XSLT transformations that insert query results as HTML fragments, either static or dynamically updated from the resources. We are currently experimenting with two kinds of queryable resources: a native XML database (Berkeley, queried via XQuery), which stores PMML data, and semantic knowledge bases, in the form of both a SPARQL endpoint and Ontopia Knowledge Suite (a Topic Maps tool, queried via a Prolog-like language called tolog). Inclusion of further resource types, such as Lucene indices, is in progress.
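The remote-data workflow described in the abstract (a stored template query, parameterized at runtime, whose result is turned into an HTML fragment) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from SEWEBAR-CMS: the SPARQL text, the `ex:` vocabulary, the `min_confidence` parameter, and the `<results>`/`<rule>` result layout are all hypothetical. The sketch also uses `xml.etree.ElementTree` as a stand-in for the XSLT step, because the Python standard library has no XSLT processor, and it works on a canned result instead of contacting an endpoint.

```python
from html import escape
from string import Template
import xml.etree.ElementTree as ET

# Hypothetical stored template query; $min_confidence is supplied at runtime.
QUERY_TEMPLATE = Template(
    "SELECT ?rule ?confidence WHERE { "
    "?rule ex:confidence ?confidence . "
    "FILTER(?confidence >= $min_confidence) }"
)


def fill_query(template, min_confidence):
    """Fill the template; float() ensures only a number reaches the query text."""
    return template.substitute(min_confidence=float(min_confidence))


def to_html_fragment(result_xml):
    """Turn a (hypothetical) XML query result into an HTML list fragment.

    This plays the role the abstract assigns to an XSLT transformation.
    """
    root = ET.fromstring(result_xml)
    items = [
        "<li>{} (confidence {})</li>".format(
            escape(rule.findtext("text", "")),
            escape(rule.get("confidence", "")),
        )
        for rule in root.iter("rule")
    ]
    return '<ul class="mining-results">' + "".join(items) + "</ul>"


if __name__ == "__main__":
    print(fill_query(QUERY_TEMPLATE, "0.8"))
    canned = (
        '<results><rule confidence="0.9">'
        "<text>a &amp; b =&gt; c</text></rule></results>"
    )
    print(to_html_fragment(canned))
```

In a real deployment the filled query would be sent to the configured resource and the fragment either stored statically in the report or regenerated on each view, as the abstract describes.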
Keywords: data mining; data analysis; management; data storage
Source: VideoLectures.NET
Last reviewed: 2019-11-30 (lxf)
Views: 56