
Researching Dictionary Needs of Language Users Through Social Media: A Semi-Automatic Approach
Course URL: http://videolectures.net/euralex2018_cibej_social_media/
Lecturer: Jaka Čibej
Institution: University of Ljubljana
Date: 2018-07-27
Language: English

Course description: With the rise of digital media in recent decades, many language-related discussions have found a home on various forums and social media platforms such as Facebook, where users can join shared-interest groups to discuss language use, problems and resources. The posts in these groups are formulated by language users as a genuine response to a specific disruption in language use and offer an empirical starting point for studying language problems. We propose an automatic approach to extracting questions from language-related Facebook groups and describe the procedure step by step. We also address the issues of copyright, privacy and ethical constraints, and propose ways to overcome them. We demonstrate the extraction method on two Slovene language-related Facebook groups: Za vsaj približno pravilno rabo slovenščine and Društvo ljubiteljskih pravopisarjev in slovničarjev. Both groups allow users to discuss language-related problems and find answers to their questions within the community. Our first extraction from these groups yielded approximately 1,900 posts (written by approximately 500 users) and 13,000 comments (posted by more than 900 users), providing ample material that can be analyzed to reveal the users' most frequent language problems.
Keywords: language; dictionary; semi-automatic approach
Source: VideoLectures.NET
Data collected: 2020-11-02:yxd
Last reviewed: 2020-11-03:zyk
Views: 59