Learning from Labeled and Unlabelled Data: When the Smoothness Assumption Holds
Course URL: http://videolectures.net/solomon_ceci_ssl/
Lecturer: Michelangelo Ceci
Institution: University of Bari
Date: 2011-03-11
Language: English
Course description: In recent years, there has been growing interest in learning algorithms that can use both labeled and unlabeled data for prediction tasks. The reason for this attention is the cost of assigning labels, which can be very high for large datasets. Two main settings have been proposed in the literature to exploit the information contained in both labeled and unlabeled data: the semi-supervised setting and the transductive setting. The former is a type of inductive learning, since the learned function is used to make predictions on any possible observation. The latter asks for less, since it is only interested in making predictions for a set of unlabeled data known at learning time. Focusing on the transductive setting, we discuss the underlying smoothness assumption and its validity for several data types characterized by (positive) autocorrelation, such as spatial and networked data. In particular, we report on the application of transductive learning approaches to these data types and on results obtained in domains characterized by a scarcity of labeled data. Finally, we discuss the transductive setting from the more general perspective of relational data mining.
Keywords: unlabeled data; label assignment; transductive setting
Source: VideoLectures.NET
Last reviewed: 2019-09-21:cwx
Views: 35