Mixtures of Hierarchical Topics with Pachinko Allocation
Course URL: http://videolectures.net/icml07_mimno_moht/
Lecturer: David Mimno
Institution: University of Massachusetts
Date: 2007-06-23
Language: English
Course description: The four-level pachinko allocation model (PAM) (Li & McCallum, 2006) represents correlations among topics using a DAG structure. It does not, however, represent a nested hierarchy of topics, in which some topical word distributions capture the vocabulary shared among several more specific topics. This paper presents hierarchical PAM, an enhancement that explicitly represents a topic hierarchy. The model can be seen as combining the advantages of hLDA's topical hierarchy representation with PAM's ability to mix multiple leaves of the topic hierarchy. Experimental results show improvements in the likelihood of held-out documents, as well as in the mutual information between automatically discovered topics and human-generated categories such as journals.
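One of the evaluation measures named in the description, the mutual information between discovered topics and human-assigned categories, can be illustrated with a minimal sketch. It assumes each document carries a single topic label and a single category label; how the paper itself assigns topics to documents is not stated here, and the function name and toy data below are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(topics, categories):
    """Mutual information, in bits, between two equal-length label sequences."""
    if len(topics) != len(categories):
        raise ValueError("label sequences must have the same length")
    n = len(topics)
    if n == 0:
        raise ValueError("label sequences must be non-empty")

    # Empirical joint and marginal counts over the documents.
    joint = Counter(zip(topics, categories))
    topic_counts = Counter(topics)
    category_counts = Counter(categories)

    mi = 0.0
    for (t, c), n_tc in joint.items():
        # p(t, c) * log2( p(t, c) / (p(t) * p(c)) ), written with raw counts.
        mi += (n_tc / n) * log2(n_tc * n / (topic_counts[t] * category_counts[c]))
    return mi


if __name__ == "__main__":
    # Hypothetical toy data: one topic label and one journal label per document.
    doc_topics = [0, 0, 1, 1]
    doc_journals = ["A", "A", "B", "B"]
    print(mutual_information(doc_topics, doc_journals))
```

A higher value means that knowing a document's topic tells more about its category; when the two labelings are statistically independent the value is zero.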
Keywords: four-level pachinko allocation model; topic hierarchy; held-out documents
Source: VideoLectures.NET
Last reviewed: 2019-05-26:cwx
Views: 110