

Low-Pass Semantics
Course URL: http://videolectures.net/metaforum2012_pereira_semantic/
Lecturer: Fernando C. N. Pereira
Institution: Google
Date: 2012-08-09
Language: English
Course description: Advances in statistical and machine learning approaches to natural language processing have yielded a wealth of methods and applications in information retrieval, speech recognition, machine translation, and information extraction. Yet, even as we enjoy these advances, we recognize that our successes are to a large extent the result of clever exploitation of redundancy in language structure and use, allowing our algorithms to eke out a few useful bits that we can put to work in applications. By focusing on applications that extract a limited amount of information from the text, finer structures such as word order or syntactic structure could be largely ignored in information retrieval or speech recognition. However, by ignoring those finer details, our language-processing systems have been stuck in an "idiot savant" stage where they can find everything but cannot understand anything. The main language processing challenge of the coming decade is to create robust, accurate, efficient methods that learn to understand the main entities and concepts discussed in any text, and the main claims made. By advancing in that direction, our systems will provide more precise answers to questions, they will verify and update knowledge bases, and they will trace arguments for and against claims throughout the written record. I will argue with examples from our recent research that we need deeper levels of linguistic analysis to do this. But I will also argue that it is possible to do much that is useful even with our very partial understanding of linguistic and computational semantics, by taking (again) advantage of distributional regularities and redundancy in large text collections to learn effective analysis and understanding rules. Thus low-pass semantics: our scientific knowledge is very far from being able to map the full spectrum of meaning, but by combining signals from the whole Web, we are starting to hear some interesting tunes.
Keywords: natural language processing; language processing systems; computational semantics
Source: VideoLectures.NET
Last reviewed: 2019-05-16:cjy