Extracting Semantic Networks from Text via Relational Clustering
Course URL: http://videolectures.net/ecmlpkdd08_kok_esnft/  
Lecturers: Pedro Domingos; Stanley Kok
Institution: University of Washington
Date: 2008-10-10
Language: English
Course description: Extracting knowledge from text has long been a goal of AI. Initial approaches were purely logical and brittle. More recently, the availability of large quantities of text on the Web has led to the development of machine learning approaches. However, to date these have mainly extracted ground facts, as opposed to general knowledge. Other learning approaches can extract logical forms, but require supervision and do not scale. In this paper we present an unsupervised approach to extracting semantic networks from large volumes of text. We use the TextRunner system [1] to extract tuples from text, and then induce general concepts and relations from them by jointly clustering the objects and relational strings in the tuples. Our approach is defined in Markov logic using four simple rules. Experiments on a dataset of two million tuples show that it outperforms three other relational clustering approaches, and extracts meaningful semantic networks.
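The idea of jointly clustering objects and relation strings can be illustrated with a toy sketch. This is not the Markov logic formulation presented in the lecture: it is a deterministic stand-in that merges relations connecting the same pairs of object clusters, and objects appearing in the same argument positions of the same relation clusters, repeating until nothing changes. The `(relation, arg1, arg2)` tuple shape, the sample data, and the names `relabel`, `groups`, and `joint_cluster` are all assumptions made for illustration.

```python
from collections import defaultdict

# Invented toy tuples in an assumed (relation, arg1, arg2) shape.
TUPLES = [
    ("is_capital_of", "Paris", "France"),
    ("is_capital_of", "Rome", "Italy"),
    ("is_located_in", "Paris", "France"),
    ("is_located_in", "Rome", "Italy"),
    ("wrote", "Tolstoy", "War and Peace"),
]


def relabel(signatures):
    """Give symbols with identical signatures the same cluster label."""
    by_sig = defaultdict(list)
    for symbol, sig in signatures.items():
        by_sig[frozenset(sig)].append(symbol)
    # The smallest member name serves as a stable label for its cluster.
    return {s: min(members) for members in by_sig.values() for s in members}


def groups(label_of):
    """Turn a symbol -> label mapping into sorted lists of cluster members."""
    out = defaultdict(list)
    for symbol, label in label_of.items():
        out[label].append(symbol)
    return sorted(sorted(members) for members in out.values())


def joint_cluster(tuples, max_rounds=10):
    """Alternate between clustering relations and objects until stable."""
    rel_of = {r: r for r, _, _ in tuples}          # each relation starts alone
    obj_of = {o: o for _, a, b in tuples for o in (a, b)}
    for _ in range(max_rounds):
        # A relation is described by the object-cluster pairs it connects.
        rel_sig = defaultdict(set)
        for r, a, b in tuples:
            rel_sig[r].add((obj_of[a], obj_of[b]))
        new_rel = relabel(rel_sig)
        # An object is described by (relation cluster, argument position).
        obj_sig = defaultdict(set)
        for r, a, b in tuples:
            obj_sig[a].add((new_rel[r], 0))
            obj_sig[b].add((new_rel[r], 1))
        new_obj = relabel(obj_sig)
        if new_rel == rel_of and new_obj == obj_of:
            break
        rel_of, obj_of = new_rel, new_obj
    return groups(rel_of), groups(obj_of)


relation_clusters, object_clusters = joint_cluster(TUPLES)
print(relation_clusters)
print(object_clusters)
```

Exact signature matching is far stricter than a probabilistic model and would fragment on noisy Web-scale data; the sketch only shows why clustering objects and relations together helps, since merging one kind of symbol makes the other kind easier to merge.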
Keywords: network analysis; machine learning; clustering; computer science; text mining
Source: VideoLectures.NET
Last reviewed: 2021-02-10:nkq