
A Social Network Approach to Unsupervised Induction of Syntactic Clusters for Bengali
Course URL: http://videolectures.net/eccs07_choudhury_sna/
Lecturer: Monojit Choudhury
Institution: Microsoft
Date: 2007-12-14
Language: English
Course description: In this paper we describe experiments on fully unsupervised induction of part-of-speech tags for Bengali words from a raw text corpus. For this purpose, we construct a network of the 5000 most frequent Bengali words, in which the nodes are word types and the weight of the edge between two types indicates their distributional similarity, and we cluster this network using the Chinese Whispers algorithm [1]. We also propose the concept of tag-entropy, which measures the cohesiveness of the word clusters in terms of the lexical categories of their constituent words.
Keywords: Bengali; Chinese Whispers algorithm; tag-entropy
Source: VideoLectures.NET
Last reviewed: 2020-06-19:cxin
Views: 73
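
The clustering step named in the course description can be illustrated with a minimal sketch of Chinese Whispers on a weighted graph. This is not the authors' implementation: the node names, edge weights, and function names below are hypothetical toy values, ties are broken deterministically (the original algorithm breaks them at random), and `tag_entropy` is only one plausible reading of the measure (Shannon entropy of the lexical-category distribution inside a cluster); the definition used in the talk may differ.

```python
import math
import random
from collections import Counter, defaultdict


def chinese_whispers(edges, max_iterations=20, seed=0):
    """Cluster an undirected weighted graph given as (u, v, weight) triples.

    Returns a dict mapping each node to its cluster label.
    """
    rng = random.Random(seed)
    neighbours = defaultdict(dict)
    for u, v, weight in edges:
        neighbours[u][v] = weight
        neighbours[v][u] = weight

    # Every node starts in its own class.
    labels = {node: node for node in neighbours}
    nodes = sorted(neighbours)

    for _ in range(max_iterations):
        rng.shuffle(nodes)  # visit nodes in random order on each pass
        changed = False
        for node in nodes:
            # Sum the edge weights contributed by each neighbouring class.
            scores = defaultdict(float)
            for other, weight in neighbours[node].items():
                scores[labels[other]] += weight
            best = max(scores.values())
            # Simplification: break ties by smallest label, not at random.
            new_label = min(l for l, s in scores.items() if s == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


def tag_entropy(tags):
    """Assumed definition: Shannon entropy (bits) of the tags in one cluster."""
    counts = Counter(tags)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy similarity graph (hypothetical weights): two tight groups, one weak bridge.
edges = [
    ("n1", "n2", 1.0), ("n1", "n3", 1.0), ("n2", "n3", 1.0),
    ("v1", "v2", 1.0), ("v1", "v3", 1.0), ("v2", "v3", 1.0),
    ("n3", "v1", 0.1),
]
labels = chinese_whispers(edges)
```

On this toy graph the weak bridge never outweighs the within-group edges, so the two groups remain separate clusters. Under the assumed definition, a cluster whose words all share one lexical category has entropy zero, and the value grows as the categories in a cluster become more mixed.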