Ensemble Monte-Carlo Planning: An Empirical Study
Course URL: http://videolectures.net/icaps2011_fern_ensemble/
Lecturer: Alan Fern
Institution: Oregon State University
Date: 2011-07-21
Language: English
Abstract: Monte-Carlo planning algorithms, such as UCT, select actions at each decision epoch by intelligently expanding a single search tree given the available time and then selecting the best root action. Recent work has provided evidence that it can be advantageous to instead construct an ensemble of search trees and to make a decision according to a weighted vote. However, these prior investigations have only considered the application domains of Go and Solitaire and were limited in the scope of ensemble configurations considered. In this paper, we conduct a more exhaustive empirical study of ensemble Monte-Carlo planning using the UCT algorithm in a set of six additional domains. In particular, we evaluate the advantages of a broad set of ensemble configurations in terms of space and time efficiency in both parallel and single-core models. Our results demonstrate that ensembles are an effective way to improve performance per unit time given a parallel time model and performance per unit space in a single-core model. However, contrary to prior isolated observations, we did not find significant evidence that ensembles improve performance per unit time in a single-core model.
Keywords: Monte-Carlo planning algorithms; search trees; single-core model
Source: VideoLectures.NET
Last reviewed: 2019-04-16:lxf
Views: 52
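
The abstract describes building an ensemble of search trees and choosing the root action by a weighted vote. The sketch below shows one possible form of such a vote, purely as an illustration. The abstract does not say how votes are weighted, so the visit-count weighting used here is an assumption, and the function name `ensemble_root_vote` and the per-tree root summary (a dict from action to visit count and mean return) are hypothetical.

```python
from collections import defaultdict


def ensemble_root_vote(trees):
    """Pick a root action from several independently built search trees.

    `trees` is a list of dicts mapping action -> (visit_count, mean_return).
    This summary format is a hypothetical stand-in for each tree's root
    statistics; it is not taken from the paper.
    """
    total_visits = defaultdict(int)
    weighted_return = defaultdict(float)
    for root_stats in trees:
        for action, (visits, mean_return) in root_stats.items():
            total_visits[action] += visits
            weighted_return[action] += visits * mean_return
    # Assumed weighting: visit-weighted average of the per-tree estimates.
    scores = {
        action: weighted_return[action] / total_visits[action]
        for action in total_visits
        if total_visits[action] > 0
    }
    return max(scores, key=scores.get)


# Two trees that disagree on their own best root action.
trees = [
    {"left": (10, 0.5), "right": (30, 0.7)},
    {"left": (40, 0.8), "right": (5, 0.1)},
]
best = ensemble_root_vote(trees)
```

In this example the first tree alone would prefer `right`, but the second tree's heavily visited estimate for `left` dominates the combined score. Other weightings (for instance, one vote per tree for its most-visited action) fit the same interface.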