Generating Synthetic RDF Data with Connected Blank Nodes for Benchmarking
Course URL: http://videolectures.net/eswc2014_lantzaki_blank_nodes/
Lecturer: Christina Lantzaki
Institution: Foundation for Research and Technology
Course date: 2014-07-30
Course language: English

Course description: Generators for synthetic RDF datasets are very important for testing and benchmarking various semantic data management tasks (e.g. querying, storage, update, comparison, integration). However, current generators do not sufficiently support (or totally ignore) blank node connectivity issues. Blank nodes are used for various purposes (e.g. for describing complex attributes), and a significant percentage of resources is currently represented with blank nodes. Moreover, several semantic data management tasks, such as isomorphism checking (useful for checking equivalence) and blank node matching (useful in comparison, versioning, synchronization, and semantic similarity functions), not only have to deal with blank nodes, but their complexity and optimality depend on the connectivity of blank nodes. To enable the comparative evaluation of the various techniques for carrying out these tasks, in this paper we present the design and implementation of a generator, called BGen, which allows building datasets containing blank nodes with the desired complexity, controllable through various features (morphology, size, diameter, density and clustering coefficient). Finally, the paper reports experimental results concerning the efficiency of the generator, as well as results from using the generated datasets, which demonstrate the value of the generator.
Keywords: semantic data; data management
Course source: VideoLectures.NET
Data collected: 2021-03-24:zyk
Last reviewed: 2021-03-24:zyk
Views: 33