

Representing Text – from characters to logic
Course URL: http://videolectures.net/esslli2011_mladenic_characters/
Lecturer: Dunja Mladenić
Institution: Jožef Stefan Institute
Date: 2011-09-12
Language: English
Course description: People use natural language and write texts to express themselves. For the purpose of text processing, text can be represented in different ways, ranging from plain characters to knowledge captured from the text in the form of logic. One of the key properties of natural languages is redundancy, both in the encoded information and in the structure used. As a consequence, different techniques can extract different aspects of information from text. They range from simple techniques, such as character counting, to more sophisticated ones, such as linear algebra, to advanced techniques that exploit the structural aspects of text. Many of these techniques deliver something useful and solve somebody's problem. Examples of such problems are language identification (solved with character counting), document categorization (solved with linear algebra methods), question answering (typically solved with shallow linguistic methods), and reasoning (typically solved using logic). The talk presents different text representations from the viewpoint of automatic text processing. The second half of the talk looks at some research results based on machine learning methods and includes demos of the corresponding prototype systems.
Keywords: text processing; language; encoding; machine learning
Source: VideoLectures.NET
Last reviewed: 2020-07-23 (yumf)
Views: 36
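
The course description names language identification as a problem that can be solved with character counting. The following is a minimal sketch of that idea, not the method presented in the talk: it builds a letter-frequency profile per language and picks the language whose profile is closest by cosine similarity. The sample sentences, the choice of cosine similarity, and the names `SAMPLES`, `char_profile`, `cosine`, and `identify_language` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Tiny illustrative training samples (assumptions, not data from the talk).
SAMPLES = {
    "english": "people use natural language and write texts to express themselves",
    "german": "menschen benutzen natürliche sprache und schreiben texte um sich auszudrücken",
    "slovenian": "ljudje uporabljajo naravni jezik in pišejo besedila da se izrazijo",
}


def char_profile(text):
    """Return the relative frequency of each letter in the text."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(ch, 0.0) for ch, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


# One letter-frequency profile per language, built from the samples above.
PROFILES = {lang: char_profile(text) for lang, text in SAMPLES.items()}


def identify_language(text):
    """Return the language whose letter profile is most similar to the text."""
    profile = char_profile(text)
    return max(PROFILES, key=lambda lang: cosine(profile, PROFILES[lang]))
```

Calling `identify_language("some input text")` returns one of the keys of `SAMPLES`. With training samples this small the result on unseen text is unreliable; profiles estimated from much larger corpora, or from character n-grams rather than single letters, would be a more realistic setup.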