
Data analytics involving text
Course URL: http://videolectures.net/single_mladenic_data_analytics/
Lecturer: Dunja Mladenić
Institution: Jožef Stefan Institute
Date: 2015-10-29
Language: English
Course description: Data available in electronic form presents great opportunities and challenges for the field of data analytics. In addition to the growing amount of traditional data organized in databases, we face large amounts of data in other forms, such as text, sensor measurements, digital traces of users, images, and video. In science there are massive data streams from astronomy, high-energy physics, ecology, genetics, and molecular biology. Moreover, the accessibility of technology enables the collection of data on many aspects of life, including fine-grained human behavior, streams of media text and video, and records of social media interactions. Data analytics is thus becoming ever more integrated into different fields of science and into life in general. Text data poses two kinds of challenge: one when we deal with millions of documents, the other when we deal with a single document. The latter may be even more demanding, as it goes to the heart of working with text, namely 'text understanding'. The talk will address several related issues, focusing on data analytics that involves text, with practical examples of applications. "This is the Information Age - everybody can be informed about anything and everything. There is no secret, therefore there is no sacredness."
Keywords: data analytics; text data; data forms
Course source: VideoLectures.NET
Data collected: 2023-05-29:chenxin01
Last reviewed: 2023-05-29:chenxin01
Views: 17