Computational Models of Basal Ganglia Function |
|
Course URL: | http://videolectures.net/mitworld_doya_cmbgf/ |
Instructor: | Kenji Doya |
Institution: | Okinawa Institute of Science and Technology |
Course date: | 2010-08-09 |
Language: | English |
Course description: | As a mathematical engineer, Kenji Doya approaches the goal of describing the brain's most intricate mechanisms from a computational perspective. He constructs reinforcement learning models involving the networked structures of the basal ganglia, and expresses his results quantitatively as probabilities, regressions, and algorithms. In this presentation, Doya covers the basic concepts of reinforcement learning, then surveys the last decade of inquiry into the components of the basal ganglia circuit governing voluntary motion. Among the topics: action values, action candidates, and reward prediction involving the neurotransmitter dopamine; model-free versus model-based learning strategies; and the essential role of serotonin as a modulator in the complex information loop. Doya’s recent research is carried out via robots he calls “cyber rodents.” His dream as an undergraduate was to “build a robot that learns the variety of behaviors on its own.” That is, the computer, not the human engineer, teaches the robot to move. He accomplished this by designing a machine-creature exhibiting emotion-like attributes characterized as “depression,” “impulsivity,” “greed,” and “patience.” Doya believes the “metaparameters” of reinforcement learning must be “tuned appropriately… Otherwise the performance of your learning is very, very poor.” The iterative process involves three terms: the reward itself, the expected reward for a new state based on choice of action, and memory of the reward gained in the previous state. When these terms are compared, any differential greater than zero can be exploited for learning. The tradeoff: “No pain, no gain.” As research advanced to increasing levels of structural specificity, Doya posited that “there seems to be spatial segregation in the function” of basal ganglia components. Specialization in aspects of reinforcement learning is now seen, for instance, in ventral versus dorsal areas of the striatum. Differentiation is also found in the cortico-basal ganglia information network: not a simple closed loop, but parallel electrical pathways conducting distinct neural operations. Further, each neuromodulator has its own mission. Dopamine encodes the temporal difference error, the reward learning signal. Acetylcholine affects the learning rate through memory updates of actions and rewards. Noradrenaline controls the width or randomness of exploration. Serotonin is implicated in “temporal discounting,” evaluating whether a given action is worth the expected reward. Doya reminds us that clinically “it is well known that the serotonin function is impaired in the depression patient.” The system of basal ganglia components and neuromodulators requires dynamic balancing. A delicate interplay determines outcomes for learning, actions, and affective states. Doya’s synthetic models are proxies for human behavior, and his computational framework describing the moving parts ultimately has therapeutic implications for psychiatric and neurological disorders. |
Keywords: | network structure; reinforcement learning; brain mechanisms |
Source: | VideoLectures.NET |
Last reviewed: | 2019-05-21:cwx |
Views: | 29 |
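
The course description names a temporal difference error built from three terms and links three neuromodulators to metaparameters of learning. The sketch below is one possible reading of that description in standard textbook notation, not code from the lecture. The symbols are assumptions: `alpha` stands for the learning rate (the role the description gives to acetylcholine), `beta` for a softmax inverse temperature that sets the randomness of exploration (noradrenaline), and `gamma` for the discount factor (serotonin). The function names and the five-state chain task are hypothetical and exist only for illustration.

```python
import math


def td_error(reward, value_next, value_current, gamma):
    """Temporal difference error: delta = r + gamma * V(s') - V(s).

    The three terms are the reward itself, the discounted value expected in
    the new state, and the remembered value of the previous state.
    """
    return reward + gamma * value_next - value_current


def softmax_probs(action_values, beta):
    """Action probabilities under softmax selection.

    beta is an assumed inverse temperature: beta = 0 gives uniform random
    exploration, a large beta gives nearly greedy choice.
    """
    best = max(action_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - best)) for q in action_values]
    total = sum(weights)
    return [w / total for w in weights]


def learn_chain_values(n_states, final_reward, alpha, gamma, sweeps):
    """TD(0) value learning on a hypothetical deterministic chain.

    States are visited in order s0 -> s1 -> ... -> terminal, and the only
    reward arrives on the last transition. alpha scales each update.
    """
    values = [0.0] * n_states
    for _ in range(sweeps):
        for s in range(n_states):
            is_last = s == n_states - 1
            reward = final_reward if is_last else 0.0
            value_next = 0.0 if is_last else values[s + 1]
            delta = td_error(reward, value_next, values[s], gamma)
            values[s] += alpha * delta
    return values
```

In this toy setting the learned value of the first state approaches `gamma ** (n_states - 1)` times the final reward, so a smaller `gamma` makes the delayed reward look less worth waiting for, which is one way to picture the "patience" versus "impulsivity" contrast in the description. Calling `softmax_probs([1.0, 2.0], beta)` with increasing `beta` shifts the choice from an even split toward the higher-valued action.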