A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies |
|
Course URL: | http://videolectures.net/iswc2018_kamdar_systematic_analysis_term... |
Lecturer: | Maulik R. Kamdar |
Institution: | Stanford University School of Medicine |
Date: | 2018-11-22 |
Language: | English |
Course description: | Reusing ontologies and their terms is a principle and best practice that most ontology development methodologies strongly encourage. Reuse promises to support semantic interoperability and to reduce engineering costs. In this paper, we present a descriptive study of the current extent of term reuse and overlap among biomedical ontologies. We use the corpus of biomedical ontologies stored in the BioPortal repository and analyze different types of reuse and overlap constructs. While we find an approximate term overlap of 25–31%, term reuse is less than 9%, with most ontologies reusing fewer than 5% of their terms from a small set of popular ontologies. Clustering analysis shows that the terms reused by a common set of ontologies have >90% semantic similarity, hinting that ontology developers tend to reuse terms that are sibling or parent–child nodes. We validate this finding by analyzing the logs generated from a Protégé plugin that enables developers to reuse terms from BioPortal. We find that most reuse constructs were 2-level subtrees on the higher levels of the class hierarchy. We developed a Web application that visualizes reuse dependencies and overlap among ontologies, and that proposes similar terms from BioPortal for a term of interest. We also identified a set of error patterns indicating that ontology developers did intend to reuse terms from other ontologies, but used different and sometimes incorrect representations. Our results point to the need for semi-automated tools that augment term reuse in the ontology engineering process through personalized recommendations. |
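Keywords: | reuse of ontologies and their terms; biomedical ontologies; analysis of different types of reuse; analysis of logs generated by a Protégé plugin |
Source: | VideoLectures.NET |
Data collected: | 2022-12-29:cyh |
Last reviewed: | 2023-05-15:cyh |
Views: | 44 |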