Monitoring Dopamine Release During Reward Learning
Course URL: http://videolectures.net/mitworld_phillips_mdrdrl/
Lecturer: Paul E. M. Phillips
Institution: Massachusetts Institute of Technology
Date: 2010-08-09
Language: English
Course description: In the process of learning, we “sometimes make more deliberative choices, and sometimes make more visceral ones,” says Paul Phillips. These are “semantic terms we intuitively know,” and scientists have become well-versed in creating tasks for animals and humans that demonstrate how these different kinds of learning (analytical-reflective vs. impulsive-reflexive) play out. Phillips has been trying to track the release of dopamine (a neurotransmitter linked to learning) in such divergent learning processes. The model-based learning system pairs a stimulus with a reward, and after training, a subject creates a “model representation of the world” that allows it to predict the appearance of the reward after the stimulus. In contrast, the model-free system of learning “uses a one dimensional value that gets updated” as the subject accumulates experience and begins to weigh the difference between the reward it expects and the reward that’s actually delivered. One of Phillips’ studies involved implanting electrodes to measure dopamine release in the striatum of rats participating in different types of learning tasks. Phillips’ work shows time-dependent changes in the release of dopamine during classical conditioning tasks. At first, dopamine spikes only after the reward, but over time the animal learns that it will receive the reward after the stimulus (a light cue), and soon the cue alone elicits the dopamine response. Phillips has also found that two distinct parts of the striatum register increased dopamine at different points in the training. “This is quite interesting in terms of thinking about what these brain regions have been implicated in, and specifically the idea of habits in the dorsal striatum.” The results of some research suggest that during these learning processes, all the dopamine neurons should be firing. But Phillips says this doesn’t explain why “we’re getting signals in (one) part of the brain but not in the other.” Phillips speculates that dopamine’s “arch nemesis acetylcholine” might be inhibiting dopamine release in certain parts of the striatum during specific phases of reinforcement learning. Phillips has also been working with selectively bred lines of rats, which seem to exhibit behaviors and dopamine release patterns suggestive of two distinct learning strategies. He concludes that “associations between stimuli and rewards can be learned through multiple strategies with different computational demands,” and he doesn’t believe that animals “are locked into one strategy or another.”
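Illustrative sketch: the model-free idea described above, a single value that is updated by the difference between the expected and the delivered reward, can be written as a simple prediction-error update. This is a generic Rescorla-Wagner-style rule and is not taken from the lecture; the function name `update_value`, the learning rate `alpha`, and the reward sequence are assumptions chosen only for illustration. The sketch shows the error at reward time shrinking with training; it does not model the transfer of the response to the cue.

```python
def update_value(rewards, alpha=0.2, initial_value=0.0):
    """Track one scalar value estimate across trials (hypothetical helper).

    On each trial the prediction error is the delivered reward minus the
    current expectation, and the expectation moves a fraction `alpha`
    toward the reward. `alpha` is an assumed learning rate.
    """
    value = initial_value
    errors = []
    for reward in rewards:
        error = reward - value      # delivered reward minus expected reward
        errors.append(error)
        value += alpha * error      # update the one-dimensional value
    return value, errors


if __name__ == "__main__":
    # Assumed training schedule: the same reward on 50 consecutive trials.
    final_value, errors = update_value([1.0] * 50)
    print("first-trial error:", errors[0])
    print("last-trial error:", errors[-1])
    print("learned value:", final_value)
```

With a constant reward, the error is largest on the first trial and approaches zero as the expectation converges on the reward, which parallels the described decline of the response to a fully predicted reward.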
Keywords: dopamine; learning strategies; testing
Source: VideoLectures.NET
Last reviewed: 2019-06-11:yuh
Views: 35