Towards a semantic repository of data mining and machine learning datasets
Course URL: http://videolectures.net/sikdd2018_kostovska_machine_learning_dat...
Lecturer: Ana Kostovska
Institution: Jožef Stefan Institute
Date: 2018-10-23
Language: English
Course description: With the exponential growth of data in all areas of our lives, there is a growing need to develop new approaches for effective data management. In particular, in the fields of Data Mining (DM) and Knowledge Discovery in Databases (KDD), scientists often invest considerable time and resources in collecting data that has already been acquired. In that context, by publishing open and FAIR (Findable, Accessible, Interoperable, Reusable) data, researchers could reuse data that was previously collected, preprocessed and stored. Motivated by this, we conducted an extensive review of current approaches, data repositories and semantic technologies used for annotation, storage and querying of datasets in the domain of machine learning (ML) and data mining. Finally, we identify the limitations of existing dataset repositories and propose a design of a semantic data repository that adheres to the FAIR principles for data management and stewardship.
Keywords: data mining; knowledge discovery in databases; machine learning
Source: VideoLectures.NET
Data collected: 2020-12-09:cjy
Last reviewed: 2020-12-09:cjy