Crowdsourcing Taxonomies
Course URL: http://videolectures.net/eswc2012_karampinas_crowdsourcing/
Lecturer: Dimitris Karampinas
Institution: University of Patras
Date: 2012-07-04
Language: English
Abstract: Taxonomies are a useful mechanism for organizing, evaluating, and searching web content. Many popular classes of web applications rely on them, from product categorization and comparative pricing of similar products to localized services and vertical or enterprise search. However, manual generation and maintenance by experts is time-consuming and cumbersome, and often results in platform-dependent, static vocabularies. Much current research therefore focuses on more flexible and dynamic ways to develop taxonomies, as evidenced by the strong interest in folksonomies within social media. We propose a new approach to constructing taxonomies. The idea stems from users' growing involvement in, and desire for, tagging and annotating web content (e.g., in social media and product categorization applications). We define the input required from human users as explicit structural information, namely supertype-subtype relationships between concepts, which humans understand well. In this way we harvest, through common annotation practices, the collective wisdom of users about the categorization of the web content they share and access. We further define the principles on which crowdsourced taxonomy construction algorithms should be based, and show that the resulting problem is NP-hard. We provide heuristic algorithms and related optimizations that aggregate human input, resolve conflicting input, and produce taxonomies. The algorithms are evaluated on real-world crowdsourcing experiments (in which real users provide such information) and on real-world taxonomies.
Keywords: folksonomies; heuristic algorithms; crowdsourcing experiments
Source: VideoLectures.NET
Last reviewed: 2020-06-20 (zyk)
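
Illustrative sketch: the abstract describes aggregating crowd-supplied supertype-subtype relationships and resolving conflicts among them. The Python below is a minimal greedy sketch of that general idea, not the lecture's actual algorithm. The function name `build_taxonomy`, the vote format, and the tie-breaking rule are all assumptions made for illustration. It counts how often each (supertype, subtype) pair was submitted, then accepts pairs in order of support, skipping any pair that would give a concept a second parent or create a cycle.

```python
from collections import Counter


def build_taxonomy(votes):
    """Greedy sketch (hypothetical, not the lecture's method).

    votes: iterable of (supertype, subtype) pairs submitted by users.
    Returns a dict mapping each subtype to its chosen supertype.
    """
    counts = Counter(votes)
    parent = {}  # subtype -> accepted supertype

    def is_self_or_ancestor(a, b):
        # Walk up from b; True if a is b or lies above b in the forest.
        while b is not None:
            if a == b:
                return True
            b = parent.get(b)
        return False

    # Most-supported pairs first; ties broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for (sup, sub), _ in ranked:
        if sub in parent:
            continue  # conflict: a better-supported parent was already chosen
        if is_self_or_ancestor(sub, sup):
            continue  # accepting this pair would create a cycle
        parent[sub] = sup
    return parent


# Made-up example votes, including one contradictory pair.
example_votes = (
    [("animal", "dog")] * 3
    + [("dog", "animal")]
    + [("animal", "cat")] * 2
)
example_taxonomy = build_taxonomy(example_votes)
```

With these example votes, the single contradictory vote ("dog", "animal") is discarded because it would form a cycle with the better-supported ("animal", "dog"). A greedy pass like this offers no optimality guarantee, which is consistent with the abstract's statement that the underlying problem is NP-hard and is approached with heuristics.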