Offering institution: University of Alberta
![](functions/showpic.php?filename=2019041809130163.png)
Strategy Evaluation in Extensive Games with Importance Sampling
Michael Johanson (University of Alberta) Typically agent evaluation is done through Monte Carlo estimation. However, stochastic agent decisions and stochastic outcomes can make this approach ...
Popularity: 30
![](functions/showpic.php?filename=2019041708401258.png)
Introduction to Reinforcement Learning and Bayesian Learning
Mohammad Ghavamzadeh (University of Alberta) Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), Bayesian methods have onl...
Popularity: 163
![](functions/showpic.php?filename=2019041708354718.png)
Manifold-adaptive dimension estimation
Amir-massoud Farahmand (University of Alberta) Intuitively, learning should be easier when the data points lie on a low-dimensional submanifold of the input space. Recently there has been a growing...
Popularity: 71
![](functions/showpic.php?filename=2019041708335072.png)
Gaussian Process Temporal Difference
Yaakov Engel (University of Alberta) Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), Bayesian methods have onl...
Popularity: 104
![](functions/showpic.php?filename=2019041504521177.png)
Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods
Yaakov Engel (University of Alberta) The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet ...
Popularity: 57
![](functions/showpic.php?filename=2019041411222269.png)
Toward the understanding of partial-monitoring games
Csaba Szepesvári (University of Alberta) Partial monitoring games form a common ground for problems such as learning with expert advice, the multi-armed bandit problem, dynamic pricing, the d...
Popularity: 57