Statistical Models for Partial Membership |
|
Course URL: | http://videolectures.net/icml08_heller_smpm/ |
Lecturer: | Katherine A. Heller |
Institution: | Duke University |
Date: | 2008-08-01 |
Language: | English |
Course description: | We present a principled Bayesian framework for modeling partial memberships of data points to clusters. Unlike a standard mixture model, which assumes that each data point belongs to one and only one mixture component, or cluster, a partial membership model allows data points to have fractional membership in multiple clusters. Algorithms that assign data points partial memberships to clusters can be useful for tasks such as clustering genes based on microarray data, global positioning, and orbit determination. Our Bayesian Partial Membership Model (BPM) uses exponential family distributions to model each cluster, and a product of these distributions, with weighted parameters, to model each data point. Here the weights correspond to the degree to which the data point belongs to each cluster. All parameters in the BPM are continuous, so we can use Hybrid Monte Carlo to perform inference and learning. We discuss relationships between the BPM and Latent Dirichlet Allocation, Mixed Membership models, Exponential Family PCA, and fuzzy clustering. Lastly, we show some experimental results and discuss nonparametric extensions to our model. |
Keywords: | modeling data points; Bayesian framework; clustering genes |
Source: | VideoLectures.NET |
Last reviewed: | 2019-11-30:lxf |
Views: | 41 |
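
The course description says each data point is modeled by a product of per-cluster exponential family distributions with weighted parameters. The record does not give the formula, so the following is only a minimal sketch under stated assumptions: the clusters are Gaussian, the membership weights are nonnegative and sum to 1, and the weighted product is renormalized, which amounts to taking a weighted sum of the clusters' natural parameters. The function name `partial_membership_gaussian` and its arguments are hypothetical, chosen for illustration.

```python
import numpy as np


def partial_membership_gaussian(weights, means, precisions):
    """Combine K Gaussian clusters into one Gaussian for a single data point.

    Assumption (not stated in the record): the data point's density is the
    renormalized product of cluster densities raised to the membership
    weights, so its natural parameters are the weighted sums of the
    clusters' natural parameters.
    """
    weights = np.asarray(weights, dtype=float)        # shape (K,)
    means = np.asarray(means, dtype=float)            # shape (K, D)
    precisions = np.asarray(precisions, dtype=float)  # shape (K, D, D)

    # First natural parameter of each cluster: precision_k @ mean_k.
    eta1 = np.einsum("kij,kj->ki", precisions, means)

    # Weighted sums of the natural parameters.
    combined_precision = np.einsum("k,kij->ij", weights, precisions)
    combined_eta1 = weights @ eta1

    # Convert back to mean and covariance.
    combined_mean = np.linalg.solve(combined_precision, combined_eta1)
    combined_cov = np.linalg.inv(combined_precision)
    return combined_mean, combined_cov


# Two illustrative clusters with identity precision (made-up values).
cluster_means = [[0.0, 0.0], [4.0, 2.0]]
cluster_precisions = [np.eye(2), np.eye(2)]

# Equal membership in both clusters.
half_mean, half_cov = partial_membership_gaussian(
    [0.5, 0.5], cluster_means, cluster_precisions
)

# Full membership in the first cluster recovers that cluster.
full_mean, full_cov = partial_membership_gaussian(
    [1.0, 0.0], cluster_means, cluster_precisions
)
```

With equal precisions, equal weights place the data point's mean midway between the two cluster means, while a one-hot weight vector reduces to ordinary single-cluster membership, as in a standard mixture model.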