
Independent Factor Topic Models
Course URL: http://videolectures.net/icml09_putthividhya_iftm/  
Lecturer: Duangmanee (Pew) Putthividhya
Institution: University of California, San Diego
Date: 2009-08-26
Language: English
Course description: Topic models such as Latent Dirichlet Allocation (LDA) and the Correlated Topic Model (CTM) have recently emerged as powerful statistical tools for modeling text documents. In this paper, we improve upon CTM and propose Independent Factor Topic Models (IFTM), which use linear latent variable models to uncover the hidden sources of correlation between topics. This work makes two main contributions. First, by using a sparse source prior model, we can directly visualize sparse patterns of topic correlations. Second, the conditional independence assumption implied by the use of latent source variables allows the objective function to factorize, leading to a fast Newton-Raphson-based variational inference algorithm. Experimental results on synthetic and real data show that IFTM runs on average 3-5 times faster than CTM, while giving competitive performance as measured by perplexity and log-likelihood of held-out data.
Keywords: topic models; independent factors; topic visualization
Source: VideoLectures.NET
Last reviewed: 2019-04-24:cwx
Views: 22
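
The description reports perplexity on held-out data as an evaluation measure. As a minimal sketch, assuming the standard definition (the exponential of the negative per-word held-out log-likelihood), it could be computed as below. The function name `held_out_perplexity` and the toy numbers are illustrative assumptions and are not taken from the lecture.

```python
import math


def held_out_perplexity(doc_log_likelihoods, doc_lengths):
    """Perplexity = exp(-(sum of document log-likelihoods) / (total word count)).

    doc_log_likelihoods: natural-log likelihood of each held-out document
    doc_lengths: number of word tokens in each held-out document
    """
    if len(doc_log_likelihoods) != len(doc_lengths):
        raise ValueError("need one log-likelihood per document")
    total_words = sum(doc_lengths)
    if total_words <= 0:
        raise ValueError("held-out set must contain at least one word")
    return math.exp(-sum(doc_log_likelihoods) / total_words)


# Toy check (hypothetical data): a model that assigns every word
# probability 1/4 scores a 3-word and a 5-word document.
lengths = [3, 5]
log_liks = [n * math.log(1 / 4) for n in lengths]
toy_perplexity = held_out_perplexity(log_liks, lengths)
```

Lower perplexity means the model assigns higher probability to unseen text. A model that spreads probability uniformly over a vocabulary of size V has perplexity V, which gives a convenient sanity check for any implementation.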