Mining for the Most Certain Predictions from Dyadic Data

Course URL: http://videolectures.net/kdd09_deodhar_mmcpdd/  
Lecturer: Meghana Deodhar
Institution: University of Texas
Date: 2009-09-14
Language: English
Course description: In several applications involving regression or classification, it is important not only to make predictions but also to assess how accurate or reliable each individual prediction is. This is particularly important when, due to finite resources or domain requirements, decisions must be based only on the most reliable predictions rather than on the entire set. This paper introduces novel and effective ways of ranking predictions by their accuracy for problems involving large-scale, heterogeneous data with a dyadic structure, i.e., where the independent variables can be naturally decomposed into three groups associated with two sets of elements and their combination. These approaches are based on modeling the data with a collection of localized models that are learnt while simultaneously partitioning (co-clustering) the data. For regression, this leads to the concept of "certainty lift". We also develop a robust predictive modeling technique that identifies and models only the most coherent regions of the data, giving high predictive accuracy on the selected subset of response values. Extensive experimentation on real-life datasets highlights the utility of the proposed approaches.
Keywords: heterogeneous data; localized models; predictive modeling techniques
Source: VideoLectures.NET
Last reviewed: 2020-06-08 (yumf)
Views: 69