A Spectral Algorithm for Latent Dirichlet Allocation - MOOC Overseas Open Courses


A Spectral Algorithm for Latent Dirichlet Allocation


Course URL:	http://videolectures.net/machine_hsu_algorithm/
Lecturer:	Daniel Hsu
Institution:	Microsoft
Date:	2013-06-14
Language:	English
Abstract:	Topic modeling is a generalization of clustering in which observations (the words in a document) are generated by multiple latent factors (topics) rather than just one. This added representational power comes at the cost of a harder unsupervised learning problem: estimating the topic-word distributions when only the words are observed and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters of a wide class of topic models, including Latent Dirichlet Allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third-order moments, which can be estimated from documents containing just three words). The method, called Excess Correlation Analysis, is based on a spectral decomposition of low-order moments via two singular value decompositions (SVDs). Moreover, the algorithm is scalable, since the SVDs are carried out only on k×k matrices, where k is the number of latent factors (topics) and is typically much smaller than the dimension of the observation (word) space.
Keywords:	topic modeling; clustering; representational power
Source:	VideoLectures.NET
Last reviewed:	2020-04-26: chenxin
Views:	71
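
The two-SVD idea in the course description can be illustrated with a minimal sketch. It is not the lecture's implementation: it assumes the simpler single-topic case (each document drawn from one topic), uses exact population moments instead of estimates from documents, and omits the Dirichlet-parameter corrections that the full LDA procedure needs. The function name `spectral_recover`, the toy matrix `O`, the weights `w`, and the test vector `eta` are all hypothetical choices made for illustration.

```python
import numpy as np

def spectral_recover(pairs, triples_eta, k):
    """Recover k topic-word columns from pair and (projected) triple moments.

    Assumes the single-topic model, where
      pairs       = O diag(w) O^T
      triples_eta = O diag(w * (O^T eta)) O^T
    and the entries of O^T eta are distinct.
    """
    # SVD 1: top-k subspace of the d x d pair matrix, used to build a whitening map.
    U, s, _ = np.linalg.svd(pairs)
    U, s = U[:, :k], s[:k]
    W = U / np.sqrt(s)                 # d x k, so that W.T @ pairs @ W = I_k

    # SVD 2: performed on a k x k matrix only (the whitened third moment).
    M = W.T @ triples_eta @ W
    M = (M + M.T) / 2                  # symmetrize against round-off
    V, _, _ = np.linalg.svd(M)

    # Undo the whitening: each column of B equals +/- sqrt(w_i) * o_i.
    B = (U * np.sqrt(s)) @ V
    col_sums = B.sum(axis=0)           # equals +/- sqrt(w_i), since each o_i sums to 1
    topics = B / col_sums              # fixes both sign and scale
    weights = col_sums ** 2
    return topics, weights

# Toy ground truth (hypothetical): 6 words, 3 topics; each column sums to 1.
O = np.array([
    [0.50, 0.10, 0.10],
    [0.30, 0.10, 0.10],
    [0.05, 0.40, 0.10],
    [0.05, 0.30, 0.10],
    [0.05, 0.05, 0.30],
    [0.05, 0.05, 0.30],
])
w = np.array([0.5, 0.3, 0.2])          # topic probabilities
eta = np.arange(1.0, 7.0)              # fixed here for reproducibility

# Exact second- and third-order moments under the single-topic model.
pairs = O @ np.diag(w) @ O.T
triples_eta = O @ np.diag(w * (O.T @ eta)) @ O.T

topics, weights = spectral_recover(pairs, triples_eta, k=3)
print(np.round(topics, 3))
print(np.round(weights, 3))
```

The recovered columns come back in an arbitrary order, so any comparison with the ground truth has to match columns first. With moments estimated from real documents, the recovery would be approximate rather than exact.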
