Space-indexed Dynamic Programming: Learning to Follow Trajectories
Course URL: http://videolectures.net/icml08_kotler_sid/
Lecturer: J. Zico Kolter
Institution: Carnegie Mellon University
Date: 2008-08-04
Language: English
Abstract: We consider the task of learning to accurately follow a trajectory in a vehicle such as a car or helicopter. A number of dynamic programming algorithms, such as Differential Dynamic Programming (DDP) and Policy Search by Dynamic Programming (PSDP), can efficiently compute non-stationary policies for these tasks. Such policies are generally well suited to trajectory following, since they can easily generate different control actions at different times in order to follow the trajectory. However, a weakness of these algorithms is that their policies are time-indexed: they apply different policies depending on the current time. This is problematic because 1) the current time may not correspond well to where we are along the trajectory, and 2) the uncertainty over future states can prevent these algorithms from finding any good policies at all. In this paper we propose a method for space-indexed dynamic programming that overcomes both of these difficulties. We begin by showing how a dynamical system can be rewritten in terms of a spatial index variable (i.e., how far along the trajectory we are) rather than as a function of time. We then use these space-indexed dynamical systems to derive space-indexed versions of the DDP and PSDP algorithms. Finally, we show that these algorithms perform well on a variety of control tasks, both in simulation and on real systems.
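The idea of re-indexing a dynamical system by distance along the trajectory, rather than by time, can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a simple unicycle vehicle model, a straight-line reference trajectory along the x-axis (so the spatial index is just x), and hand-picked feedback gains; the names `time_step`, `space_step`, and `rollout` are hypothetical. Each spatial step picks the elapsed time so that the next state lands exactly on the next plane perpendicular to the trajectory.

```python
import math


def time_step(state, steer, dt, speed=1.0):
    """One Euler step of an assumed unicycle model, indexed by time."""
    x, y, theta = state
    return (x + speed * math.cos(theta) * dt,
            y + speed * math.sin(theta) * dt,
            theta + steer * dt)


def space_step(state, steer, dd, speed=1.0):
    """Advance the same model by a fixed distance dd along the x-axis.

    The elapsed time is chosen so the next state has x increased by dd.
    This only works while the vehicle makes forward progress (x_dot > 0).
    """
    _, _, theta = state
    x_dot = speed * math.cos(theta)
    if x_dot <= 0.0:
        raise ValueError("no forward progress along the trajectory")
    dt = dd / x_dot
    return time_step(state, steer, dt, speed), dt


def rollout(state, gains, dd):
    """Apply a space-indexed policy: one (k_y, k_theta) pair per spatial index."""
    elapsed = 0.0
    for k_y, k_theta in gains:
        _, y, theta = state
        steer = -k_y * y - k_theta * theta  # illustrative linear feedback
        state, dt = space_step(state, steer, dd)
        elapsed += dt
    return state, elapsed
```

Calling `rollout((0.0, 0.5, 0.3), [(1.0, 2.0)] * 50, 0.1)` advances the vehicle through 50 spatial steps of length 0.1. The policy used at each step depends on how far along the trajectory the vehicle is, not on how much time has passed, so a vehicle that is running late still receives the action intended for its current position.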
Keywords: dynamic programming algorithms; non-stationary policies; spatial index variable
Source: VideoLectures.NET
Last reviewed: 2020-06-22 (chenxin)
Views: 62