MDL Tutorial
Course URL: http://videolectures.net/icml08_grunwald_mld/
Lecturer: Peter Grünwald
Institution: Centrum Wiskunde & Informatica (CWI)
Date: 2008-08-12
Language: English
Course description: We give a self-contained tutorial on the Minimum Description Length (MDL) approach to modeling, learning and prediction. We focus on the recent (post-1995) formulations of MDL, which can be quite different from the older methods that are often still called 'MDL' in the machine learning and UAI communities. In its modern guise, MDL is based on the concept of a 'universal model'. We explain this concept at length. We show that previous versions of MDL (based on so-called two-part codes), Bayesian model selection and predictive validation (a variation of cross-validation) can all be interpreted as approximations to model selection based on universal models. Modern MDL prescribes the use of a certain 'optimal' universal model, the so-called 'normalized maximum likelihood model' or 'Shtarkov distribution'. This is related to (yet different from) Bayesian model selection with non-informative priors. It leads to a penalization of 'complex' models that can be given an intuitive differential-geometric interpretation. Roughly speaking, the complexity of a parametric model is directly related to the number of distinguishable probability distributions that it contains. We also discuss some recent extensions such as the 'luckiness principle', which can be used if the Shtarkov distribution is undefined, and the 'switch distribution', which allows for a resolution of the AIC-BIC dilemma.
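For reference, the 'normalized maximum likelihood model' mentioned in the description has a standard closed form in the MDL literature; the following sketch uses our own notation (the maximum-likelihood estimator \hat{\theta}(x^n) over data sequences x^n) and is not taken from the lecture itself:

\[
  p_{\mathrm{NML}}(x^n) \;=\; \frac{p_{\hat{\theta}(x^n)}(x^n)}{\sum_{y^n \in \mathcal{X}^n} p_{\hat{\theta}(y^n)}(y^n)} .
\]

MDL model selection then chooses the model M minimizing the NML code length

\[
  -\log p_{\hat{\theta}(x^n)}(x^n) + \mathrm{COMP}_n(M),
  \qquad
  \mathrm{COMP}_n(M) = \log \sum_{y^n \in \mathcal{X}^n} p_{\hat{\theta}(y^n)}(y^n) .
\]

For a smooth k-parameter model, under regularity conditions, Rissanen's asymptotic expansion gives

\[
  \mathrm{COMP}_n(M) = \frac{k}{2} \log \frac{n}{2\pi}
    + \log \int_{\Theta} \sqrt{\det I(\theta)} \, d\theta + o(1),
\]

where I(\theta) is the Fisher information matrix; the integral counts, in the differential-geometric sense alluded to above, the distinguishable distributions the model contains. For example, for the Bernoulli model I(\theta) = 1/(\theta(1-\theta)), so \int_0^1 \sqrt{\det I(\theta)}\, d\theta = \pi and \mathrm{COMP}_n \approx \frac{1}{2} \log \frac{n}{2\pi} + \log \pi.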
Keywords: universal model; predictive validation; parametric model
Course source: VideoLectures.NET
Data collected: 2022-11-10 by chenjy
Last reviewed: 2022-11-10 by chenjy
Views: 34