Overseas Open Course Navigator - Zhejiang University Ningbo Institute of Technology Library

Offering institution: University of Alberta

Strategy Evaluation in Extensive Games with Importance Sampling
Michael Johanson (University of Alberta) Typically agent evaluation is done through Monte Carlo estimation. However, stochastic agent decisions and stochastic outcomes can make this approach ...
Popularity: 30

Introduction to Reinforcement Learning and Bayesian learning
Mohammad Ghavamzadeh (University of Alberta) Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), Bayesian methods have onl...
Popularity: 163

Manifold-adaptive dimension estimation
Amir-massoud Farahmand (University of Alberta) Intuitively, learning should be easier when the data points lie on a low-dimensional submanifold of the input space. Recently there has been a growing...
Popularity: 71

Gaussian Process Temporal Difference
Yaakov Engel (University of Alberta) Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), Bayesian methods have onl...
Popularity: 104

Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods
Yaakov Engel (University of Alberta) The Octopus arm is a highly versatile and complex limb. How the Octopus controls such a hyper-redundant arm (not to mention eight of them!) is as yet ...
Popularity: 57

Toward the understanding of partial-monitoring games
Csaba Szepesvári (University of Alberta) Partial monitoring games form a common ground for problems such as learning with expert advice, the multi-armed bandit problem, dynamic pricing, the d...
Popularity: 57

