RL Glue and Codecs Glue
Course URL: http://videolectures.net/mloss08_tanner_rlgcg/  
Lecturer: Brian Tanner
Institution: University of Alberta
Date: 2008-12-20
Language: English
Course description: RL-Glue is a protocol and software implementation for evaluating reinforcement learning algorithms. Our system facilitates the comparison of alternative algorithms and can greatly accelerate research progress, much as the UCI database has accelerated progress in supervised machine learning. Creating a comparable benchmarking resource for reinforcement learning is challenging because of the temporal nature of reinforcement learning. Reinforcement learning agents interact with a dynamic process (the environment), which generates observations and rewards. The observations and rewards received by the learning agent depend on its actions, so training data cannot simply be stored in a file as they are in supervised learning. Instead, the reinforcement learning agent and environment must be interacting programs. RL-Glue agents and environments can be written in Java, C/C++, Matlab, Python, and Lisp, and can all run on one machine or connect across the Internet. In this seminar, we will introduce the design principles that helped shape RL-Glue and demonstrate some of the interesting extensions that have been created by the reinforcement learning community.
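The point that agent and environment must be interacting programs can be illustrated with a minimal Python sketch. This is not RL-Glue's API: the class names, method names, the toy corridor task, and the `run_episode` helper are all hypothetical, chosen only to show why the data an agent sees depends on the actions it takes.

```python
class CorridorEnvironment:
    """Hypothetical toy environment: walk from cell 0 to the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def start(self):
        # Reset the episode and return the first observation.
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped at 0.
        self.position = max(0, self.position + action)
        terminal = self.position >= self.length - 1
        reward = 0.0 if terminal else -1.0
        return reward, self.position, terminal


class AlwaysRightAgent:
    """Hypothetical trivial agent that ignores feedback and moves right."""

    def start(self, observation):
        return 1

    def step(self, reward, observation):
        return 1

    def end(self, reward):
        pass


def run_episode(agent, environment, max_steps=100):
    """Mediate one episode and return (total_reward, steps_taken)."""
    observation = environment.start()
    action = agent.start(observation)
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        # The next observation and reward depend on the chosen action,
        # which is why the data cannot be pre-recorded in a file.
        reward, observation, terminal = environment.step(action)
        total_reward += reward
        steps += 1
        if terminal:
            agent.end(reward)
            break
        action = agent.step(reward, observation)
    return total_reward, steps
```

Calling `run_episode(AlwaysRightAgent(), CorridorEnvironment())` runs one episode of this toy task. Because the mediating loop touches the agent and the environment only through a few calls, either side could in principle be replaced by a program in another language or on another machine, which is the kind of separation the description attributes to RL-Glue.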
Keywords: algorithms; training data; extensions
Source: VideoLectures.NET
Last reviewed: 2019-06-30 (yuh)
Views: 31