The Sensitivity of Latent Dirichlet Allocation for Information Retrieval |
|
Course URL: | http://videolectures.net/ecmlpkdd09_park_sldair/ |
Lecturer: | Laurence A. F. Park |
Institution: | University of Melbourne |
Date: | 2009-10-20 |
Language: | English |
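Chinese abstract (translated): | It has been shown that using topic models for information retrieval can improve precision when they are used in an appropriate form. Latent Dirichlet Allocation (LDA) is a generative topic model that lets us model documents using a Dirichlet prior. With this topic model, we can obtain a fitted Dirichlet parameter that gives the maximum likelihood for the document set. In this paper, we study the sensitivity of LDA to the Dirichlet parameter when it is used for information retrieval. We compare the topic model computation time, storage requirements, and retrieval precision of fitted LDA against LDA with a uniform Dirichlet prior. The results show no significant benefit from using fitted LDA rather than LDA with a constant Dirichlet parameter, indicating that LDA is insensitive to the Dirichlet parameter when used for information retrieval. |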
Course description: | It has been shown that the use of topic models for information retrieval provides an increase in precision when used in the appropriate form. Latent Dirichlet Allocation (LDA) is a generative topic model that allows us to model documents using a Dirichlet prior. Using this topic model, we are able to obtain a fitted Dirichlet parameter that provides the maximum likelihood for the document set. In this article, we examine the sensitivity of LDA with respect to the Dirichlet parameter when used for information retrieval. We compare the topic model computation times, storage requirements, and retrieval precision of fitted LDA to those of LDA with a uniform Dirichlet prior. The results show that there is no significant benefit in using fitted LDA over LDA with a constant Dirichlet parameter, hence showing that LDA is insensitive to the Dirichlet parameter when used for information retrieval. |
Keywords: | topic models; information retrieval; document modeling |
Source: | VideoLectures.NET |
Last reviewed: | 2019-03-27:lxf |
Views: | 60 |