Automatic Modeling of User's Real World Activities from the Web for Semantic IR
Course URL: http://videolectures.net/www2010_fukazawa_amu/
Lecturer: Yusuke Fukazawa
Institution: NTT DOCOMO
Date: 2010-05-17
Language: English
Course description: We have been developing a task-based service navigation system that offers the user services relevant to the task the user wants to perform. The system lets the user concretize a request within a task model developed by human experts. In this study, to reduce the cost of collecting a wide variety of activities, we investigate the automatic modeling of users' real-world activities from the web. To extract the widest possible variety of activities with high precision and recall, we investigate the appropriate amount of content and the appropriate resources to extract from. Our results show that we do not need to examine the entire web, which is too time-consuming; only a limited number of search results from blog content (e.g., 900 out of 21,000,000 search results) is needed. In addition, to estimate the hierarchical relationships present in the activity model with the lowest possible error rate, we propose a method that divides the representation of each activity into a noun part and a verb part and calculates the mutual information between them. The results show that almost 80% of the hierarchical relationships can be captured by the proposed method.
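The noun/verb scoring step mentioned in the description can be illustrated with a small sketch. It assumes pointwise mutual information (PMI) computed from co-occurrence counts of (verb, noun) pairs; the description does not give the exact formula, so the function name `pmi_table`, the toy activity pairs, and the choice of PMI are all assumptions for illustration, not the lecturer's implementation. How the scores are then turned into hierarchical links is not specified in the description and is left out here.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Return PMI (in bits) for each distinct (verb, noun) pair.

    `pairs` is a list of (verb, noun) tuples, one per observed activity
    expression. This is an illustrative assumption about the input format.
    """
    pair_counts = Counter(pairs)
    verb_counts = Counter(verb for verb, _ in pairs)
    noun_counts = Counter(noun for _, noun in pairs)
    total = len(pairs)

    scores = {}
    for (verb, noun), count in pair_counts.items():
        p_joint = count / total
        p_verb = verb_counts[verb] / total
        p_noun = noun_counts[noun] / total
        # PMI compares the observed joint probability with the
        # probability expected if verb and noun were independent.
        scores[(verb, noun)] = math.log2(p_joint / (p_verb * p_noun))
    return scores


# Hypothetical toy data standing in for activities extracted from blog text.
pairs = [
    ("buy", "ticket"),
    ("buy", "ticket"),
    ("buy", "souvenir"),
    ("watch", "movie"),
    ("watch", "movie"),
    ("eat", "popcorn"),
]

scores = pmi_table(pairs)
for (verb, noun), score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{verb} {noun}: {score:.3f}")
```

A higher score means the verb and noun occur together more often than their individual frequencies would predict. With such a small sample, rare pairs get inflated scores, so a real pipeline would likely need a minimum-count threshold or smoothing.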
Keywords: computer science; text collection; automatic modeling
Source: VideoLectures.NET
Last reviewed: 2020-06-12:yumf
Views: 49