Reasoning, Attention and Memory
Course URL: http://videolectures.net/deeplearning2016_chopra_attention_memory...
Lecturer: Sumit Chopra
Institution: Facebook
Date: 2016-08-23
Language: English

Course description: The machine learning community has had great success in the last decades at solving basic prediction tasks such as text classification, image annotation and speech recognition. However, solutions to deeper reasoning tasks have remained elusive. A key component towards achieving deeper reasoning is the use of long-term dependencies as well as short-term context during inference. Until recently, most existing machine learning models have lacked an easy way to read from and write to part of a (potentially very large) long-term memory component, and to combine this seamlessly with inference. To combine memory with reasoning, a model must learn how to access it, i.e. to perform *attention* over its memory.

Within the last year or so, however, there has been notable progress in this area. Models developing notions of attention have shown positive results on a number of real-world tasks such as machine translation and image captioning. There has also been a surge in building models of computation which explore differing forms of explicit storage. Towards that end, I'll shed some light on a set of models that fall in this category. In particular, I'll discuss Memory Networks and their application to a wide variety of tasks, such as question answering based on simulated stories, cloze-style question answering, and dialog modeling. I'll also talk about their subsequently proposed variants, including End2End Memory Networks and Key-Value Memory Networks. In addition, I will talk about Neural Turing Machines and Stack-Augmented Recurrent Neural Networks. Throughout the talk I'll discuss the advantages and disadvantages of each of these models and their variants. I will conclude with a discussion of what is still lacking among these models and potential open problems.
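
The core mechanism referred to above, performing attention over a memory, can be made concrete with a small sketch. The following Python snippet is a minimal illustration (not the lecture's implementation): each memory slot is scored against a query, the scores are normalized with a softmax, and the read result is the attention-weighted sum of the stored values, as in End2End Memory Networks. The memory_read function, the array shapes, and the random toy data are illustrative assumptions only.

import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(query, keys, values):
    # Score every memory slot against the query (dot product),
    # turn the scores into an attention distribution, and return
    # the attention-weighted sum of the value vectors.
    scores = keys @ query          # shape: (num_slots,)
    weights = softmax(scores)      # attention over memory slots
    return weights @ values, weights

# Toy memory with 4 slots and 3-dimensional embeddings (illustrative only).
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 3))     # representations the memory is addressed by
values = rng.normal(size=(4, 3))   # representations returned when the memory is read
query = rng.normal(size=3)

read_vector, attention = memory_read(query, keys, values)
print("attention weights:", attention)
print("read vector:", read_vector)

In a plain Memory Network the same embedded memories serve for both addressing and reading; the Key-Value variant simply lets the addressing representation (keys) differ from what is returned (values).
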
Keywords: machine learning models; Memory Networks
Source: VideoLectures.NET
Data collected: 2021-06-16:liyy
Last reviewed: 2021-06-16:liyy
Views: 54