Approximate Inference in Natural Language Processing
Course URL: http://videolectures.net/nipsworkshops09_smith_ainlp/
Lecturer: Noah Smith
Institution: Carnegie Mellon University
Date: 2010-01-19
Language: English
Course description: I'll start out by presenting an idealized version of the natural language processing problem of parsing. I will brazenly suggest that most of NLP is reducible to variations on parsing problems. I'll show how dynamic programming solves the idealized version of the problem, both for calculating modes and marginals over parse trees, exploiting some key independence assumptions about the structure of natural language sentences. I will then discuss two approximate inference methods that let us build more powerful models of parsing. Neither comes with strong theoretical guarantees, but both are demonstrated to perform strongly in experiments on real NLP data. The first method builds on the dynamic programming representation, combining max-product and sum-product methods to produce, approximately, the k-best parses and a residual sum over the rest of the parses, useful when incorporating features that violate the usual independence assumptions. Experiments validate the approach with a discriminative model for machine translation. The second method turns a parsing problem instance into a concise integer linear program. Approximate inference is then accomplished using well-known linear program relaxation. This is embedded in a new online learning algorithm that tries to penalize uninterpretable fractional solutions (and therefore inference cost at evaluation time). We show that this approach leads to state-of-the-art parsing performance on seven languages, with improved speed for both exact and approximate inference and no significant performance loss.
Keywords: natural language; dynamic programming; sum-product method
Source: VideoLectures.NET
Last reviewed: 2019-09-07 (lxf)
View count: 76
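
Illustrative sketch (not from the lecture): the course description mentions dynamic programming that computes both modes and marginals over parse trees. One common way to picture this is a CKY-style chart whose "plus" operation is swapped between addition (sum-product, giving the total probability of all parses) and max (max-product, giving the score of the single best parse). The toy grammar, its probabilities, the example sentence, and the function name `cky` below are all assumptions made up for this illustration; they are not taken from the talk.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (illustrative numbers only).
BINARY = [  # (parent, left child, right child, probability)
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 0.7),
    ("VP", "VP", "PP", 0.3),
    ("NP", "NP", "PP", 0.2),
    ("PP", "P", "NP", 1.0),
]
LEXICAL = [  # (tag, word, probability)
    ("NP", "she", 0.3),
    ("NP", "stars", 0.3),
    ("NP", "telescopes", 0.2),
    ("V", "saw", 1.0),
    ("P", "with", 1.0),
]


def cky(words, plus):
    """Fill a CKY chart, combining alternative derivations with `plus`.

    plus = addition -> sum-product: total probability of all parses.
    plus = max      -> max-product: probability of the best parse.
    """
    n = len(words)
    chart = defaultdict(float)  # (start, end, label) -> score, 0.0 if absent
    # Width-1 spans come from the lexical rules.
    for i, w in enumerate(words):
        for tag, word, p in LEXICAL:
            if word == w:
                chart[i, i + 1, tag] = plus(chart[i, i + 1, tag], p)
    # Wider spans are built from two adjacent, strictly smaller spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right, p in BINARY:
                    score = p * chart[i, k, left] * chart[k, j, right]
                    chart[i, j, parent] = plus(chart[i, j, parent], score)
    return chart[0, n, "S"]


sentence = "she saw stars with telescopes".split()
total = cky(sentence, lambda a, b: a + b)  # sum over both attachments
best = cky(sentence, max)                  # score of the best attachment
print(total, best, best / total)
```

In this toy setting the sentence has two parses (the prepositional phrase attaches either to the verb phrase or to the noun phrase), so the ratio `best / total` is the posterior probability of the best parse. The independence assumption that makes the chart work is that a constituent's score depends only on its label and span; features that break this assumption are what motivate the approximate methods named in the description.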