Text Mining, Information and Fact Extraction (TMIFE)
Course URL: http://videolectures.net/russir08_moens_tmife/
Instructor: Marie-Francine Moens
Institution: University of Leuven
Start date: 2008-11-04
Course language: English
Course description: communities (medical informatics, security, blog and news analysis, business information analysis, legal informatics, etc.). Still, today it is a somewhat fragmented subfield of human language technologies and information retrieval, where the themes of (often forgotten) old-style pattern-based IE and more recent machine learning techniques, as applied in medical informatics, opinion mining and blog extraction, are scattered across various conferences and sessions (computational linguistics, artificial intelligence, machine learning, Web technologies, semantic computing). The aim of this tutorial is to explain important technologies, from handcrafted patterns to learning, and especially to focus on how they blend together to suit the needs of current information systems that retrieve or mine information, or that make decisions and solve problems based on the extracted information. This unified perspective also entails valuable insights into the role of traditional pipelined system architectures and more recent probabilistic inference techniques. Probabilistic extraction, by which text is translated into a variety of semantic labels, integrates perfectly with probabilistic retrieval models that naturally combine surface text features and semantic labels in ranking computations, among which are the popular language retrieval models. Finally, information extraction alleviates the knowledge acquisition bottleneck in expert and question answering systems that operate in more restricted subject domains. We conclude with some pointers to new challenges, among them the recognition of complex semantic concepts (e.g., narrative scripts, or issues such as medical malpractice or competitiveness) in texts. Because the tutorial reconciles many techniques and application domains, it will attract students and researchers with different backgrounds.
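Keywords: machine learning; human language technologies; computer science; text mining; information extraction
Course source: VideoLectures.NET
Last reviewed: 2020-06-02, 毛岱琦 (volunteer course editor)
Views: 52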