Offering institution: SequeL Lab

A Relative Exponential Weighing Algorithm for Adversarial Utility-based Dueling Bandits
  Pratik Gajane (SequeL Lab): We study the K-armed dueling bandit problem, a variation of the classical Multi-Armed Bandit (MAB) problem in which the learner receives only ...
Popularity: 28