
Poster: Knowledge as a Constraint on Uncertainty for Unsupervised Classification: A Study in Part-of-Speech Tagging
Course URL: http://videolectures.net/icml08_murray_kcu/
Lecturer: Thomas J. Murray
Institution: University of Southern California
Date: 2008-08-11
Language: English
Course description: This paper evaluates the use of prior knowledge to limit or bias the choices of a classifier during otherwise unsupervised training and classification. Focusing on effects in the uncertainty of the model's decisions, we quantify the contributions of the knowledge source as a reduction in the conditional entropy of the label distribution given the input corpus. Allowing us to compare different sets of knowledge without annotated data, we find that label entropy is highly predictive of final performance for a standard Hidden Markov Model (HMM) on the task of part-of-speech tagging. Our results show too that even basic levels of knowledge, integrated as labeling constraints, have considerable effect on classification accuracy, in addition to more stable and efficient training convergence. Finally, for cases where the model's internal classes need to be interpreted and mapped to a desired label set, we find that, for constrained models, the requirements for annotated data to make quality assignments are greatly reduced.
Keywords: unsupervised; quantifying knowledge sources; label entropy
Source: VideoLectures.NET
Last reviewed: 2019-04-19:lxf
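
Illustration: the description measures a knowledge source by how much it reduces the entropy of the label distribution. The sketch below is a toy version of that idea, not the paper's method. It assumes each token's tag is drawn uniformly from its allowed tags, whereas the paper measures entropy under the model's own distribution. The tag set, corpus, tag dictionary, and the function name `label_entropy_bits` are all hypothetical.

```python
import math

def label_entropy_bits(tokens, tag_dictionary, tagset):
    """Sum of per-token label entropies in bits.

    Assumption (ours, for illustration): each token's tag is uniformly
    distributed over the tags it is allowed to take.
    """
    total = 0.0
    for token in tokens:
        # Words missing from the dictionary are unconstrained: any tag is allowed.
        allowed = tag_dictionary.get(token, tagset)
        total += math.log2(len(allowed))
    return total

# Hypothetical four-tag label set and five-token corpus.
TAGSET = {"DET", "NOUN", "VERB", "ADJ"}
corpus = ["the", "dog", "runs", "the", "run"]

# Hypothetical knowledge source: a partial tag dictionary used as labeling constraints.
tag_dict = {"the": {"DET"}, "dog": {"NOUN"}, "run": {"NOUN", "VERB"}}

unconstrained = label_entropy_bits(corpus, {}, TAGSET)
constrained = label_entropy_bits(corpus, tag_dict, TAGSET)
reduction = unconstrained - constrained

print(f"unconstrained: {unconstrained} bits")
print(f"constrained:   {constrained} bits")
print(f"reduction:     {reduction} bits")
```

No annotated data is needed to compute the reduction, which is what lets different knowledge sources be compared before any evaluation against gold labels.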
