Benchmarking Linked Open Data technology
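Course URL: http://videolectures.net/dataforum2012_boncz_benchmarking/
Lecturer: Peter Boncz
Institution: Centrum Wiskunde & Informatica (CWI), the Dutch national research institute for mathematics and computer science
Date: 2012-07-16
Language: English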
Course description: In this talk, we present SRBench, the first benchmark for Streaming RDF Storage Engines, which is based entirely on real-world datasets. Streaming data is growing faster than the tools available to extract and derive knowledge from it, so researchers have set out to find solutions in which Semantic Web technologies are adapted and extended for publishing, sharing, analysing and understanding such data. Various approaches are emerging, e.g., C-SPARQL, SPARQLStream, StreamSPARQL and CQELS. To help researchers and users compare streaming RDF engines in a standardised application scenario, we propose SRBench, with which one can assess the ability of a streaming RDF engine to cope with a broad range of use cases typically encountered in real-world scenarios. The design of SRBench is based on an extensive study of the state of the art in data stream management systems, streaming RDF processing engines, and existing RDF/SPARQL benchmarks. This ensures that the benchmark captures all important aspects of streaming RDF processing. The first goal of SRBench is to evaluate the functional completeness of a streaming RDF engine. The benchmark contains a concise yet comprehensive set of queries covering the major aspects of streaming SPARQL query processing, ranging from simple pattern-matching queries to queries with complex reasoning tasks. The main advantages of applying Semantic Web technologies to streaming data include better search facilities through added semantics, reasoning through ontologies, and integration with other data sets. The benchmark assesses the ability of a streaming RDF engine to handle these distinctive features with queries that apply reasoning not only over the streaming sensor data, but also over the metadata and even other data sets in the Linked Open Data (LOD) cloud.
To give a first baseline and illustrate the state of the art, we show results obtained from implementing SRBench using the SPARQLStream query-processing engine developed at the Universidad Politécnica de Madrid (UPM). The engine supports a streaming RDF query language that is also called SPARQLStream. The evaluation shows that the functionality supported by SPARQLStream is fairly complete. At the language level, it can express all benchmark queries easily and concisely. At the query-processing level, some missing features were discovered; preliminary code has been added for all of them, for further development.
Keywords: streaming data; standardised applications; storage engines
Source: VideoLectures.NET
Last reviewed: 2019-03-10:cwx
Views: 112