Using Shape Expressions (ShEx) to Share RDF Data Models and to Guide Curation with Rigorous Validation
Course URL: http://videolectures.net/eswc2019_thornton_shape_expressions/
Lecturer: Katherine Thornton
Institution: Yale University
Date: 2019-09-19
Language: English
Course description: We discuss Shape Expressions (ShEx), a concise, formal modeling and validation language for RDF structures. For instance, a Shape Expression could prescribe that subjects in a given RDF graph that fall into the shape “Paper” are expected to have a section called “Abstract”, and any ShEx implementation can confirm whether that is indeed the case for all such subjects within a given graph or subgraph. There are currently five actively maintained ShEx implementations. We discuss how we use the JavaScript, Scala and Python implementations in RDF data validation workflows in distinct, applied contexts. We present examples of how ShEx can be used to model and validate data from two different sources: the domain-specific Fast Healthcare Interoperability Resources (FHIR) and the domain-generic Wikidata knowledge base, the linked database built and maintained by the Wikimedia Foundation as a sister project to Wikipedia. Example projects that use Wikidata as a data curation platform are presented as well, along with the ways in which they use ShEx for modeling and validation. When reusing RDF graphs created by others, it is important to know how the data is represented. Current practices of using human-readable descriptions or ontologies to communicate data structures often lack the precision that data consumers need to understand data representation details quickly and easily. We provide concrete examples of how we use ShEx as a constraint and validation language that allows humans and machines to communicate unambiguously about data assets. We use ShEx to exchange and understand data models of different origins, and to express a shared model of a resource’s footprint in a Linked Data source. We also use ShEx to develop data models in an agile fashion, test them against sample data, and revise or refine them. The expressivity of ShEx allows us to catch disagreements, inconsistencies, or errors efficiently, both at the time of input and through batch inspections.
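The “Paper must have an Abstract” example in the description can be made concrete with a small sketch. The code below is not ShEx and does not use any ShEx implementation; it is a plain-Python illustration of the kind of check a ShEx validator would perform. The vocabulary (`ex:Paper`, `ex:abstract`), the choice to model the abstract as a single property, the selection of nodes by `rdf:type`, and the function name `validate_papers` are all assumptions made for illustration and do not come from the lecture.

```python
from typing import Dict, List, Tuple

# A triple is (subject, predicate, object), all given as plain strings here.
Triple = Tuple[str, str, str]

# Hypothetical vocabulary, invented for this illustration only.
RDF_TYPE = "rdf:type"
PAPER = "ex:Paper"
HAS_ABSTRACT = "ex:abstract"


def validate_papers(graph: List[Triple]) -> Dict[str, bool]:
    """Map each subject typed ex:Paper to True if it has at least one ex:abstract."""
    # Assumption: nodes to check are selected by rdf:type. Real ShEx
    # validators are told which nodes to test against which shape.
    papers = {s for s, p, o in graph if p == RDF_TYPE and o == PAPER}
    return {
        paper: any(s == paper and p == HAS_ABSTRACT for s, p, _ in graph)
        for paper in sorted(papers)
    }


# Sample data: one paper with an abstract, one without.
graph = [
    ("ex:paper1", RDF_TYPE, PAPER),
    ("ex:paper1", HAS_ABSTRACT, "We discuss Shape Expressions."),
    ("ex:paper2", RDF_TYPE, PAPER),
    ("ex:paper2", "ex:title", "A paper without an abstract"),
]

report = validate_papers(graph)
for node, conforms in report.items():
    print(node, "conforms" if conforms else "does not conform")
```

With a real ShEx implementation the constraint would be written once as a schema and shared, so that data producers and consumers test against the same definition instead of re-coding the check by hand as done here.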
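Keywords: Shape Expressions; sharing RDF data models; Fast Healthcare Interoperability Resources; RDF data validation workflows; formal RDF structure modeling; guiding rigorous validation; domain-generic Wikidata knowledge base
Source: VideoLectures.NET
Data collected: 2022-09-28:cyh
Last reviewed: 2022-09-28:cyh
Views: 26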