Online Reinforcement Learning from Concurrent Customer Interaction Sequences
Course URL: http://videolectures.net/onlinelearning2012_silver_reinforcement_...
Lecturer: David Silver
Institution: University College London
Date: 2013-05-28
Language: English
Abstract: This talk explores applications in which a company interacts with many customers. The company has an objective function, such as maximising revenue, customer satisfaction, or customer loyalty, which depends primarily on the sequence of interactions between company and customer. A key aspect of this setting is that interactions with different customers occur asynchronously and in parallel. As a result, it is imperative to learn online from partial interaction sequences, so that information acquired from one customer is efficiently assimilated and applied in subsequent interactions with other customers. I will present the first framework for reinforcement learning in this setting, using an asynchronous variant of temporal-difference learning to learn efficiently from partial interaction sequences.
Keywords: applications; revenue maximisation; interaction sequences; online learning
Source: VideoLectures.NET
Last reviewed: 2020-10-22 (chenxin)
Views: 85