Stable Prediction across Unknown Environments
Course URL: http://videolectures.net/kdd2018_kuang_prediction_environments/
Lecturer: Kun Kuang
Institution: Department of Computer Science and Technology, Tsinghua University
Date: 2018-11-23
Language: English
Course description: In many important machine learning applications, the training distribution used to learn a probabilistic classifier differs from the distribution on which the classifier will be used to make predictions. Traditional methods correct the distribution shift by reweighting training data with the ratio of the density between test and training data. However, in many applications training takes place without prior knowledge of the testing distribution. Recently, methods have been proposed to address the shift by learning the underlying causal structure, but those methods rely on diversity arising from multiple training data sets, and they further have complexity limitations in high dimensions. In this paper, we propose a novel Deep Global Balancing Regression (DGBR) algorithm to jointly optimize a deep auto-encoder model for feature selection and a global balancing model for stable prediction across unknown environments. The global balancing model constructs balancing weights that facilitate estimation of partial effects of features (holding fixed all other features), a problem that is challenging in high dimensions, and thus helps to identify stable, causal relationships between features and outcomes. The deep auto-encoder model is designed to reduce the dimensionality of the feature space, thus making global balancing easier. We show, both theoretically and with empirical experiments, that our algorithm can make stable predictions across unknown environments. Our experiments on both synthetic and real datasets demonstrate that our algorithm outperforms the state-of-the-art methods for stable prediction across unknown environments.
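The balancing idea in the description can be illustrated with a small sketch. It is not the authors' DGBR implementation: it assumes binary features, omits the auto-encoder and the regression step, and only evaluates a balancing loss for given sample weights rather than learning them. The names `weighted_mean`, `balancing_loss`, `X_demo`, `uniform`, and `balanced` are hypothetical, introduced here for illustration.

```python
def weighted_mean(values, weights):
    """Weighted average of `values` under non-negative `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def balancing_loss(X, w):
    """Treat each binary feature j in turn as a 'treatment' and sum the
    squared gaps between the weighted means of every other feature in the
    X[i][j] == 1 group and the X[i][j] == 0 group.

    A loss of zero means that, under weights w, no feature's value is
    predictive of the mean of any other feature.
    """
    p = len(X[0])
    total = 0.0
    for j in range(p):
        treated = [i for i, row in enumerate(X) if row[j] == 1]
        control = [i for i, row in enumerate(X) if row[j] == 0]
        if not treated or not control:
            continue  # feature j is constant, so there is nothing to balance
        for k in range(p):
            if k == j:
                continue
            mean_t = weighted_mean([X[i][k] for i in treated],
                                   [w[i] for i in treated])
            mean_c = weighted_mean([X[i][k] for i in control],
                                   [w[i] for i in control])
            total += (mean_t - mean_c) ** 2
    return total


# Illustrative data (assumed): two positively correlated binary features.
X_demo = [[0, 0], [0, 0], [0, 1], [1, 0], [1, 1], [1, 1]]
uniform = [1.0] * 6
balanced = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0]  # up-weights the rarer (0,1), (1,0) rows

print(balancing_loss(X_demo, uniform))
print(balancing_loss(X_demo, balanced))
```

With uniform weights the loss is positive because the two features are correlated in the sample; the second weight vector up-weights the under-represented rows so that each feature's weighted mean is the same in both groups of the other feature. In a full method the weights would be optimized rather than supplied by hand.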
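Keywords: important machine learning applications; learning probabilistic classifiers; prediction distribution; global balancing model
Source: VideoLectures.NET
Data collected: 2023-01-30:cyh
Last edited: 2023-01-31:cyh
Views: 30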