A Support Vector Machine Approach to Dutch Part-of-Speech Tagging
Course URL: http://videolectures.net/ida07_poel_asvma/
Lecturer: Mannes Poel
Institution: University of Twente
Date: 2007-10-08
Language: English
Course description: Part-of-Speech tagging, the assignment of Parts-of-Speech to the words in a given context of use, is a basic technique in many systems that handle natural languages. This paper describes a method for supervised training of a Part-of-Speech tagger using a committee of Support Vector Machines on a large corpus of annotated transcriptions of spoken Dutch. Special attention is paid to the decomposition of the large data set into parts for common, uncommon and unknown words. This not only solves the space problems caused by the amount of data, but also improves the tagging time. The resulting tagger reaches an accuracy of 97.54%, which is quite good, and its tagging speed is reasonably good.
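The decomposition into common, uncommon and unknown words can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method from the lecture: the frequency threshold `COMMON_MIN_COUNT`, the two-letter suffix feature for unknown words, the toy tags, and above all the `MajorityTagger` class, which is a simple most-frequent-tag stand-in used in place of a real Support Vector Machine so that the example runs with the standard library only. What the sketch shows is the routing idea: each word is sent to the model trained on its own partition, so no single model has to hold the whole data set.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: words seen at least this often count as "common".
COMMON_MIN_COUNT = 3


class MajorityTagger:
    """Stand-in for an SVM: predicts the most frequent tag seen for a key."""

    def __init__(self):
        self.by_key = defaultdict(Counter)
        self.overall = Counter()

    def fit(self, keys, tags):
        for key, tag in zip(keys, tags):
            self.by_key[key][tag] += 1
            self.overall[tag] += 1
        return self

    def predict(self, key):
        # Fall back to the overall tag distribution for unseen keys.
        counter = self.by_key.get(key) or self.overall
        return counter.most_common(1)[0][0]


def partition_vocabulary(tagged_tokens, common_min_count=COMMON_MIN_COUNT):
    """Split the training vocabulary into common and uncommon word sets."""
    counts = Counter(word for word, _ in tagged_tokens)
    common = {w for w, c in counts.items() if c >= common_min_count}
    uncommon = set(counts) - common
    return common, uncommon


def route(word, common, uncommon):
    """Name the partition whose model should tag this word."""
    if word in common:
        return "common"
    if word in uncommon:
        return "uncommon"
    return "unknown"


def train(tagged_tokens):
    """Train one model per partition on that partition's tokens only."""
    common, uncommon = partition_vocabulary(tagged_tokens)
    data = {name: ([], []) for name in ("common", "uncommon", "unknown")}
    for word, tag in tagged_tokens:
        part = route(word, common, uncommon)
        data[part][0].append(word)
        data[part][1].append(tag)
        if part == "uncommon":
            # Assumption: rare words approximate unknown words, so the
            # unknown-word model learns from their two-letter suffixes.
            data["unknown"][0].append(word[-2:])
            data["unknown"][1].append(tag)
    models = {name: MajorityTagger().fit(keys, tags)
              for name, (keys, tags) in data.items()}
    return common, uncommon, models


def tag_word(word, common, uncommon, models):
    """Route a word to its partition and return (partition, predicted tag)."""
    part = route(word, common, uncommon)
    key = word[-2:] if part == "unknown" else word
    return part, models[part].predict(key)


# Toy training data with illustrative tags (not a real corpus sample).
corpus = [("de", "LID")] * 3 + [("huis", "N"), ("lopen", "WW"), ("werken", "WW")]
common, uncommon, models = train(corpus)
print(tag_word("de", common, uncommon, models))
print(tag_word("fietsen", common, uncommon, models))
```

In a fuller version each `MajorityTagger` would be replaced by a classifier trained on contextual features of the surrounding words; the routing logic would stay the same.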
Keywords: part-of-speech tagging; support vector machines; spoken-language annotation
Source: VideoLectures.NET
Last reviewed: 2019-04-27:lxf
Views: 58