A Partially Supervised Metric Multidimensional Scaling Algorithm for Textual Data Visualization
Course URL: http://videolectures.net/ida07_martin_merino_apsmmsa/
Lecturer: Manuel Martin-Merino Acera
Institution: Pontifical University of Salamanca
Date: 2007-10-08
Language: English
Course description: Multidimensional scaling (MDS) algorithms allow us to visualize relationships among high-dimensional objects in an intuitive way. An interesting application of MDS is the visualization of semantic relations among documents or terms in textual databases. However, the MDS algorithms proposed in the literature exhibit low discriminant power: the unsupervised nature of the algorithms and the 'curse of dimensionality' favor overlap among different topics in the map. This problem can be overcome by exploiting the fact that many textual collections provide a categorization for a small subset of documents. In this paper we define new semi-supervised measures that better reflect the semantic classes of the textual collection by taking into account the a priori categorization of a subset of documents. The resulting dissimilarities are then incorporated into the Torgerson MDS algorithm to improve the separation among topics in the map. The experimental results show that the proposed model outperforms well-known unsupervised alternatives.
Keywords: multidimensional scaling algorithms; semantic relations; semi-supervised measures
Source: VideoLectures.NET
Last reviewed: 2020-06-15:heyf
Views: 46
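
Illustrative sketch: the course description mentions feeding label-aware dissimilarities into the Torgerson MDS algorithm. The Python sketch below shows what that pipeline might look like. `torgerson_mds` is the standard classical MDS procedure (double centering followed by an eigendecomposition). `adjust_with_labels`, together with its `shrink` and `stretch` parameters, is a hypothetical stand-in and not the measure defined in the lecture: it simply shrinks dissimilarities between labeled documents of the same class and stretches those between labeled documents of different classes.

```python
import numpy as np


def torgerson_mds(d, k=2):
    """Classical (Torgerson) MDS: embed n objects in k dimensions
    from an n x n symmetric dissimilarity matrix d."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    b = -0.5 * j @ (d ** 2) @ j                # double-centered squared dissimilarities
    vals, vecs = np.linalg.eigh(b)             # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]           # indices of the k largest eigenvalues
    lam = np.clip(vals[idx], 0.0, None)        # guard against small negative eigenvalues
    return vecs[:, idx] * np.sqrt(lam)


def adjust_with_labels(d, labels, shrink=0.5, stretch=1.5):
    """Hypothetical label-aware adjustment (an assumption, NOT the lecture's measure).
    labels[i] is a class index, or -1 when document i is unlabeled."""
    labels = np.asarray(labels)
    known = labels >= 0
    both = known[:, None] & known[None, :]     # pairs where both documents are labeled
    same = labels[:, None] == labels[None, :]  # pairs sharing the same label value
    out = d.astype(float).copy()
    out[both & same] *= shrink                 # pull same-class pairs together
    out[both & ~same] *= stretch               # push different-class pairs apart
    return out
```

A map would then be obtained with `torgerson_mds(adjust_with_labels(d, labels), k=2)`, where `d` holds the original document dissimilarities. Pairs involving an unlabeled document are left unchanged in this sketch.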