A Stochastic Memoizer for Sequence Data
Course URL: http://videolectures.net/icml09_wood_sms/
Lecturer: Frank Wood
Institution: University College London
Date: 2009-08-26
Language: English
Course description: We propose an unbounded-depth, hierarchical, Bayesian nonparametric model for discrete sequence data. This model can be estimated from a single training sequence, yet shares statistical strength between subsequent symbol predictive distributions in such a way that predictive performance generalizes well. The model builds on a specific parameterization of an unbounded-depth hierarchical Pitman-Yor process. We introduce analytic marginalization steps (using coagulation operators) to reduce this model to one that can be represented in time and space linear in the length of the training sequence. We show how to perform inference in such a model without truncation approximation and introduce fragmentation operators necessary to do predictive inference. We demonstrate the sequence memoizer by using it as a language model, achieving state-of-the-art results.
Keywords: discrete sequence data; Bayesian nonparametric models; unbounded-depth hierarchy
Source: VideoLectures.NET
Last reviewed: 2019-04-24:lxf
Views: 28