Large-scale Data Mining: MapReduce and Beyond
Course URL: http://videolectures.net/kdd2010_papadimitriou_sun_yan_lsdm/
Instructors: Spiros Papadimitriou; Jimeng Sun; Rong Yan
Institution: Watson Research Center
Date: 2010-10-01
Language: English
Course description:

Data are becoming available in unprecedented volumes. This difference in scale is a difference in kind, and it presents new opportunities. MapReduce has drawn a lot of attention in recent years for large-scale data processing and mining. In this tutorial, we introduce MapReduce and its application and research in data mining. In particular, we want to answer the following questions:

• What is MapReduce, and why do we need it for data mining?
• What mining applications need MapReduce?
• What are the advantages and limitations of using MapReduce?
• How do you use MapReduce?
• What other tools are available for large-scale data processing and mining?

More specifically, this tutorial is organized into three parts:

1. MapReduce basics: the MapReduce programming model, its system architecture, its open-source implementation Hadoop, and extensions such as HBase, Pig, Cascading, and Hive.

2. MapReduce algorithms: MapReduce implementations of standard data mining algorithms, such as clustering (k-means), classification (k-NN, naive Bayes), and graph mining (PageRank).

3. MapReduce applications: general applications of MapReduce beyond data mining, including text processing and data warehousing.

Keywords: programming model; data mining
Source: VideoLectures.NET
Data collected: 2020-10-29:zyk
Last reviewed: 2020-10-29:zyk
Views: 36