Wiki-MID: a very large Multi-domain Interests Dataset of Twitter users with mappings to Wikipedia
Course URL: http://videolectures.net/iswc2018_faralli_wiki_mid_interests/
Lecturer: Stefano Faralli
Institution: University of Rome
Date: 2018-11-22
Language: English
Course description: This paper presents Wiki-MID, a LOD-compliant multi-domain interests dataset for training and testing recommender systems, and the methodology used to create the dataset from Twitter messages in English and Italian. Our English dataset includes an average of 90 multi-domain preferences per user on music, books, movies, celebrities, sport, politics and much more, for about half a million users tracked over six months in 2017. Preferences are either extracted from messages of users who use Spotify, Goodreads and other similar content-sharing platforms, or induced from their "topical" friends, i.e., followees representing an interest rather than a social relation between peers. In addition, preferred items are matched with the Wikipedia articles describing them. This unique feature of our dataset provides a means to categorize preferred items by exploiting available semantic resources linked to Wikipedia, such as the Wikipedia Category Graph, DBpedia, BabelNet and others.
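Keywords: Wiki-MID; multi-domain interests dataset; training and testing recommender systems; categorizing preferred items; available semantic resources
Source: VideoLectures.NET
Data collected: 2023-01-14:cyh
Last reviewed: 2023-01-14:cyh
Views: 49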