
Regularization and Feature Selection in Least Squares Temporal-Difference Learning
Course URL: http://videolectures.net/icml09_kolter_rfsl/
Lecturer: J. Zico Kolter
Institution: Carnegie Mellon University
Date: 2009-09-17
Language: English
Course description: We consider the task of reinforcement learning with linear value function approximation. Temporal-difference algorithms, and in particular the Least-Squares Temporal Difference (LSTD) algorithm, provide a method for learning the parameters of the value function, but when the number of features is large this algorithm can overfit the data and is computationally expensive. In this paper, we propose a regularization framework for the LSTD algorithm that overcomes these difficulties. In particular, we focus on the case of l1 regularization, which is robust to irrelevant features and also serves as a method for feature selection. Although the l1-regularized LSTD solution cannot be expressed as a convex optimization problem, we present an algorithm similar to the Least Angle Regression (LARS) algorithm that can efficiently compute the optimal solution. Finally, we demonstrate the performance of the algorithm experimentally.
Keywords: linear value function; least-squares temporal difference algorithm; regularization framework
Source: VideoLectures.NET
Last reviewed: 2020-06-22:chenxin
Views: 110
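
Illustrative sketch: the description above refers to a regularization framework for LSTD. The following is a minimal sketch of LSTD with a simple l2 (ridge) penalty, chosen here because it has a closed form. It is not the l1 LARS-style algorithm from the lecture. The function name `lstd_l2`, its parameters, and the two-state toy problem are assumptions made for illustration only.

```python
import numpy as np

def lstd_l2(phi, phi_next, rewards, gamma=0.9, lam=0.0):
    """Hypothetical helper: l2-regularized LSTD weights from sampled transitions.

    phi      -- (n, k) features of the visited states
    phi_next -- (n, k) features of the successor states
    rewards  -- (n,) observed rewards
    lam      -- l2 penalty strength (lam=0 gives plain LSTD)
    """
    k = phi.shape[1]
    # Standard LSTD system A w = b, with lam * I added to A as the ridge term.
    A = phi.T @ (phi - gamma * phi_next) + lam * np.eye(k)
    b = phi.T @ rewards
    return np.linalg.solve(A, b)

# Toy problem (assumed): state 0 moves to state 1 with reward 1,
# state 1 loops on itself with reward 0. Features are one-hot.
phi = np.eye(2)
phi_next = np.array([[0.0, 1.0], [0.0, 1.0]])
rewards = np.array([1.0, 0.0])

w_plain = lstd_l2(phi, phi_next, rewards, gamma=0.9, lam=0.0)
w_ridge = lstd_l2(phi, phi_next, rewards, gamma=0.9, lam=1.0)
print(w_plain, w_ridge)
```

With one-hot features and no penalty, the weights equal the state values of the toy problem; increasing `lam` shrinks them toward zero. An l1 penalty, the case the lecture focuses on, has no such closed form, which is why a LARS-like procedure is needed there.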