Information Theoretic Model Selection in Clustering
Course URL: http://videolectures.net/nipsworkshops09_buhmann_itm/
Lecturer: Joachim M. Buhmann
Institution: ETH Zurich
Date: 2010-01-19
Language: English
Course description: Model selection in clustering requires (i) specifying a clustering principle and (ii) deciding on an appropriate number of clusters depending on the noise level in the data. We advocate an information theoretic perspective in which the uncertainty in the data set induces an uncertainty in the solution space of clusterings. A clustering model that can tolerate a higher level of noise in the data than competing models is considered superior, provided that the clustering solution is equally informative. This tradeoff between informativeness and robustness is used as a model selection criterion. The requirement that solutions generalize from one data set to an equally probable second data set gives rise to a new notion of structure-induced information.
Keywords: clustering principle; noise level; data set
Source: VideoLectures.NET
Last reviewed: 2019-09-07:lxf
Views: 52