Course URL: http://videolectures.net/ecmlpkdd09_segond_bngutd/
Lecturer: Marc Segond
Institution: European Centre for Soft Computing
Date: 2009-10-20
Language: English
Course description: According to Koestler, the notion of a bisociation denotes a connection between pieces of information from habitually separated domains or categories. In this paper, we consider a methodology to find such bisociations using a network representation of knowledge, which is called a BisoNet, because it promises to contain bisociations. In a first step, we consider how to create BisoNets from several textual databases taken from different domains using simple text-mining techniques. To achieve this, we introduce a procedure to link nodes of a BisoNet and to endow such links with weights, which is based on a new measure for comparing text frequency vectors. In a second step, we try to rediscover known bisociations, which were originally found by a human domain expert, namely indirect relations between migraine and magnesium as they are hidden in medical research articles published before 1987. We observe that these bisociations are easily rediscovered by simply following the strongest links. Future work includes extending our methods to non-textual data, improving the similarity measure, and applying more sophisticated graph mining methods.
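The two steps in the description (linking terms by comparing their text frequency vectors, then following the strongest links) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus `DOCS` is invented and is not data from the talk, the function names are hypothetical, and plain cosine similarity stands in for the new similarity measure, which the description does not define.

```python
import math
from collections import Counter

# Hypothetical toy corpus for illustration only; NOT data from the talk.
DOCS = [
    "migraine stress vasospasm",
    "migraine vasospasm serotonin",
    "magnesium vasospasm calcium",
    "magnesium calcium serotonin",
]


def term_vectors(docs):
    """Map each term to its vector of per-document frequencies."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    return {term: [c[term] for c in counts] for term in vocab}


def cosine(u, v):
    """Cosine similarity; a stand-in for the measure used in the talk."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_links(vectors, threshold=0.0):
    """Link every pair of terms whose similarity exceeds the threshold."""
    links = {term: {} for term in vectors}
    terms = list(vectors)
    for i, s in enumerate(terms):
        for t in terms[i + 1:]:
            weight = cosine(vectors[s], vectors[t])
            if weight > threshold:
                links[s][t] = weight
                links[t][s] = weight
    return links


def strongest_neighbors(links, term, k=3):
    """Return the k most strongly linked neighbors of a term."""
    ranked = sorted(links[term].items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


def bridges(links, a, b):
    """Terms linked to both a and b, ranked by their weaker link."""
    shared = set(links[a]) & set(links[b])
    scored = [(t, min(links[a][t], links[b][t])) for t in shared]
    return sorted(scored, key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    links = build_links(term_vectors(DOCS))
    print(strongest_neighbors(links, "migraine"))
    print(bridges(links, "migraine", "magnesium"))
```

In this toy corpus the two target terms never co-occur, so they share no direct link, and any connection between them has to pass through an intermediate term. Ranking bridges by their weaker link is one simple design choice among several; a product of weights or a longer path search would be equally plausible.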
Keywords: text mining; bisociation; graph mining
Source: VideoLectures.NET
Last reviewed: 2020-06-15:wuyq
Views: 70