

The web of data: how are we doing so far?
Course URL: http://videolectures.net/www2021_simperl_web_of_data/
Lecturer: Elena Simperl
Institution: King's College London
Date: 2021-05-03
Language: English
Course description: Throughout its history, the web has shaped our understanding of, and interactions with, data. In the age of AI, this is mostly the data that the web helps create, find, organise, and utilise through its myriad interconnected applications and user communities. This data takes many forms: digital traces we leave behind while online, user-generated content, datasets published by scientists and governments, or labels produced on crowdsourcing platforms to train machine learning algorithms. The web of data was supposed to bring it all together through links, metadata, shared vocabularies, and standardised technologies. We’ve come a long way since the first linked open data cloud was published in 2007 with 12 datasets. But we’ve also encountered roadblocks that we have yet to overcome. Huge investments have been made in opening different streams of data to developers, yet publishers struggle to show evidence of the impact of these investments and to become sustainable. Finding and making sense of data online is as critical as it has ever been, especially as more and more jobs come to rely on it. Lots of data turns out to be flawed, eroding our social bonds and trust in institutions. Despite a rise in knowledge graphs, data silos are more common than ever, and governments are building virtual borders to data flows in the name of digital sovereignty. In this talk I will present recent research that provides insights into the state of the web of data today. Just as the web stands for a melange of services and platforms, from search to shopping to social networks, the web of data is a concept with multiple facets; we need to unpack these facets to understand how much progress we’ve made and where the challenges really lie.
Data on the web is not just about the standards and protocols promoted by the linked open data cloud; in a wider interpretation, it amounts to the (sparsely linked) graph of web tables embedded in documents, to millions of online datasets in various formats, and also to charts that present data in accessible ways. The web of data is a mechanism to publish and reuse data, a social network, a marketplace, and a platform to help train the AIs of this world, all affected to a greater or lesser degree by technopolitics. I will discuss these different interpretations, supported by studies into open data portals, data communities, and crowdsourced datasets, and deep-dive into technical, user experience, innovation, and policy questions and their impact on present and future developments in this space.
Keywords: web; data; open data portals
Source: VideoLectures.NET
Data collected: 2022-04-13:zkj
Last reviewed: 2022-04-13:zkj
Views: 43