

A Dirty Model for Multi-task Learning
Course URL: http://videolectures.net/nips2010_jalali_dmm/
Lecturer: Ali Jalali
Institution: University of Texas
Date: 2011-01-12
Language: English
Course description: We consider the multiple linear regression problem, in a setting where some of the relevant features may be shared across the tasks. Much recent research has studied the use of L1/Lq norm block-regularization with q > 1 for such (possibly) block-structured problems, establishing strong recovery guarantees even under high-dimensional scaling, where the number of features scales with the number of observations. However, these papers also caution that the performance of such block-regularized methods depends strongly on the extent to which the features are shared across tasks. Indeed, they show that if the extent of overlap is less than a threshold, or even if the parameter values in the shared features are highly uneven, then block L1/Lq regularization can actually perform worse than simple separate element-wise L1 regularization. This is far from a realistic multi-task setting: not only does the set of relevant features have to be exactly the same across tasks, but their values have to be as well. Here, we ask the question: can we leverage support and parameter overlap when it exists, but not pay a penalty when it does not? This falls under the more general question of whether we can model data that may not fall into a single neat structural bracket (all block-sparse, or all low-rank, and so on). Here, we take a first step, focusing on developing a dirty model for the multiple regression problem. Our method uses a very simple idea: we decompose the parameters into two components and regularize them differently. We show, both theoretically and empirically, that our method strictly and noticeably outperforms both L1 and L1/Lq methods over the entire range of possible overlaps. We also provide theoretical guarantees that the method performs well under high-dimensional scaling. (See the illustrative sketch at the end of this entry.)
Keywords: Computer science; Machine learning; Regression
Source: VideoLectures.NET
Last reviewed: 2020-06-08 (yumf)
Views: 30
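
The two-component decomposition described in the course description above can be posed as a single convex program. The sketch below is a minimal illustration, not the authors' reference implementation: it assumes the cvxpy library, uses synthetic placeholder data, picks the L1/Linf member of the L1/Lq block-norm family, and uses arbitrary illustrative penalty weights lam_b and lam_s.

```python
import numpy as np
import cvxpy as cp

# Synthetic placeholder data: k tasks, n observations per task, p features.
k, n, p = 4, 100, 50
rng = np.random.default_rng(0)
X = [rng.standard_normal((n, p)) for _ in range(k)]  # per-task design matrices
Y = [rng.standard_normal(n) for _ in range(k)]       # per-task responses (random, illustration only)

# Dirty-model idea: write the p x k coefficient matrix as B + S and
# penalize the two components differently:
#   B gets an L1/Linf block norm (sum over features of the max across tasks),
#     which favours features shared by all tasks;
#   S gets an element-wise L1 norm, which allows a few task-specific features.
B = cp.Variable((p, k))
S = cp.Variable((p, k))
lam_b, lam_s = 1.0, 0.5  # illustrative weights; in practice set by theory or cross-validation

loss = sum(cp.sum_squares(Y[t] - X[t] @ (B[:, t] + S[:, t])) for t in range(k)) / (2 * n)
block_penalty = cp.sum(cp.max(cp.abs(B), axis=1))  # L1/Linf block norm on B
elem_penalty = cp.sum(cp.abs(S))                   # element-wise L1 norm on S

problem = cp.Problem(cp.Minimize(loss + lam_b * block_penalty + lam_s * elem_penalty))
problem.solve()

Theta_hat = B.value + S.value  # combined coefficient estimates for all tasks
print("features in the shared (block) support:",
      int(np.sum(np.abs(B.value).max(axis=1) > 1e-6)))
```

Intuitively, driving lam_s to infinity forces S to zero and recovers pure block regularization, while driving lam_b to infinity recovers plain element-wise L1; with both finite, the estimator can adapt to whatever degree of feature overlap the tasks actually share.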