
Finding a Better k: A Psychophysical Investigation of Clustering
Course URL: http://videolectures.net/nipsworkshops09_lewis_fbp/
Lecturer: Joshua M. Lewis
Institution: University of California, San Diego
Date: 2010-01-19
Language: English
Course description: Finding the number of groups in a data set, k, is an important problem in the field of unsupervised machine learning, with applications across many scientific domains. The problem is difficult, however, because it is ambiguous and hierarchical, and current techniques for finding k often produce unsatisfying results. Humans are adept at navigating ambiguous and hierarchical situations, and this paper measures human performance on the problem of finding k across a wide variety of data sets. We find that humans employ multiple strategies for choosing k, often simultaneously, and that the number of possible interpretations of even simple data sets with very few (N < 20) samples can be quite high. In addition, two leading machine learning algorithms are compared to the human results.
Keywords: data sets; machine learning; number of groups k
Source: VideoLectures.NET
Last reviewed: 2019-09-07:lxf
Views: 51