DBpedia SPARQL Benchmark – Performance Assessment with Real Queries on Real Data
Course URL: http://videolectures.net/iswc2011_morsey_real/
Lecturer: Mohamed Morsey
Institution: Leipzig University
Date: 2011-11-25
Language: English
Course description: Triple stores are the backbone of increasingly many Data Web applications. It is thus evident that the performance of those stores is mission critical for individual projects as well as for data integration on the Data Web in general. Consequently, it is of central importance during the implementation of any of these applications to have a clear picture of the weaknesses and strengths of current triple store implementations. In this paper, we propose a generic SPARQL benchmark creation procedure, which we apply to the DBpedia knowledge base. Previous approaches often compared relational and triple stores and, thus, settled on measuring performance against a relational database which had been converted to RDF by using SQL-like queries. In contrast to those approaches, our benchmark is based on queries that were actually issued by humans and applications against existing RDF data not resembling a relational schema. Our generic procedure for benchmark creation is based on query-log mining, clustering and SPARQL feature analysis. We argue that a pure SPARQL benchmark is more useful to compare existing triple stores and provide results for the popular triple store implementations Virtuoso, Sesame, Jena-TDB, and BigOWLIM. The subsequent comparison of our results with other benchmark results indicates that the performance of triple stores is far less homogeneous than suggested by previous benchmarks.
Keywords: Data Web applications; relational databases; query-log mining; feature analysis
Source: VideoLectures.NET
Last reviewed: 2019-12-05 (lxf)
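Illustration: the description names query-log mining and SPARQL feature analysis as steps of the benchmark creation procedure but gives no detail. The sketch below shows one simple way such a feature count over a query log might look. The feature list, the function names, and the sample queries are all assumptions made for illustration; they are not taken from the lecture or the paper, and the keyword matching is deliberately naive (it would also match keywords inside string literals).

```python
import re
from collections import Counter

# Hypothetical feature list; the feature set actually used in the
# benchmark is not stated in this record.
FEATURES = ("OPTIONAL", "FILTER", "UNION", "DISTINCT", "REGEX")


def sparql_features(query):
    """Return the set of FEATURES keywords that occur in the query text."""
    upper = query.upper()
    return frozenset(
        f for f in FEATURES if re.search(r"\b" + f + r"\b", upper)
    )


def feature_frequencies(queries):
    """Count, for each feature, the number of queries that use it."""
    counts = Counter()
    for q in queries:
        counts.update(sparql_features(q))
    return counts


# Made-up log entries, for illustration only.
sample_log = [
    "SELECT DISTINCT ?s WHERE { ?s a ?type }",
    "SELECT ?s WHERE { ?s ?p ?o OPTIONAL { ?s rdfs:label ?l } "
    "FILTER regex(?l, 'x') }",
    "SELECT ?s WHERE { { ?s ?p 1 } UNION { ?s ?p 2 } }",
]

freq = feature_frequencies(sample_log)
for feature, count in sorted(freq.items()):
    print(feature, count)
```

Queries sharing the same feature set could then be grouped before clustering; a real pipeline would parse the queries with a SPARQL parser rather than match keywords.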