Mining Social Networks for Personalized Email Prioritization
Course URL: http://videolectures.net/kdd09_yoo_msnfp/
Lecturer: Shinjae Yoo
Institution: Carnegie Mellon University
Date: 2009-09-14
Language: English
Course description: Email is one of the most prevalent communication tools today, and solving the email overload problem is urgent. A good way to alleviate email overload is to automatically prioritize received messages according to the priorities of each user. However, research on statistical learning methods for fully personalized email prioritization (PEP) has been sparse due to privacy issues, since people are reluctant to share personal messages and importance judgments with the research community. It is therefore important to develop and evaluate PEP methods under the assumption that only limited training examples are available, and that the system has access only to each user's own email data while training and testing the model for that user. This paper presents the first study (to the best of our knowledge) under such an assumption. Specifically, we focus on the analysis of personal social networks to capture user groups and to obtain rich features that represent social roles from the viewpoint of a particular user. We also developed a novel semi-supervised (transductive) learning algorithm that propagates importance labels from training examples to test examples through message and user nodes in a personal email network. Together, these methods enable us to obtain an enriched vector representation of each new email message, which consists of both the standard features of an email message (such as words in the title or body, sender and receiver IDs, etc.) and the induced social features from the sender and receivers of the message. Using the enriched vector representation as the input to SVM classifiers that predict the importance level of each test message, we obtained significant performance improvement over the baseline system (without induced social features) in our experiments on a multi-user data collection: the relative error reduction in MAE was 31% in micro-averaging and 14% in macro-averaging.
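The reported figures depend on how the mean absolute error (MAE) is averaged across users. The following is a minimal sketch of the two averaging schemes and of relative error reduction, not the authors' evaluation code. The user names, the 1 to 5 importance scale, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import mean


def mae(pairs):
    """Mean absolute error over (true_level, predicted_level) pairs."""
    return mean(abs(true - pred) for true, pred in pairs)


def micro_mae(per_user):
    """Pool the messages of all users, then average once.

    Users with more messages carry more weight.
    """
    return mae([pair for pairs in per_user.values() for pair in pairs])


def macro_mae(per_user):
    """Compute the MAE per user, then average those values.

    Every user carries the same weight, whatever their message count.
    """
    return mean(mae(pairs) for pairs in per_user.values())


def relative_error_reduction(baseline_mae, improved_mae):
    """Fraction of the baseline error removed by the improved system."""
    return (baseline_mae - improved_mae) / baseline_mae


# Hypothetical predictions: (true importance, predicted importance),
# assuming importance levels on a 1-5 scale.
per_user = {
    "user_a": [(5, 4), (3, 3), (1, 2), (4, 4)],
    "user_b": [(2, 4), (5, 5)],
}

print("micro-averaged MAE:", micro_mae(per_user))
print("macro-averaged MAE:", macro_mae(per_user))
print("relative reduction:", relative_error_reduction(1.0, 0.75))
```

Because micro-averaging weights users by message volume and macro-averaging weights them equally, the same system can show different relative reductions under the two schemes, as in the 31% and 14% figures above.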
Keywords: email; overload; prioritization
Source: VideoLectures.NET
Last reviewed: 2019-05-10:cwx
Views: 40