Offering institution: University of Alberta

31
Strategy Evaluation in Extensive Games with Importance Sampling
  Michael Johanson (University of Alberta) Typically agent evaluation is done through Monte Carlo estimation. However, stochastic agent decisions and stochastic outcomes can make this approach ...
Popularity: 30

32
Introduction to Reinforcement Learning and Bayesian learning
  Mohammad Ghavamzadeh (University of Alberta) Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), Bayesian methods have onl...
Popularity: 163

33
Manifold-adaptive dimension estimation
  Amir-massoud Farahmand (University of Alberta) Intuitively, learning should be easier when the data points lie on a low-dimensional submanifold of the input space. Recently there has been a growing...
Popularity: 71

34
Gaussian Process Temporal Difference
  Yaakov Engel (University of Alberta) Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), Bayesian methods have onl...
Popularity: 104

35
Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods
  Yaakov Engel (University of Alberta) The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet ...
Popularity: 57

36
Toward the understanding of partial-monitoring games
  Csaba Szepesvári (University of Alberta) Partial monitoring games form a common ground for problems such as learning with expert advice, the multi-armed bandit problem, dynamic pricing, the d...
Popularity: 57