Machine Learning in the Cloud with GraphLab
Course URL: http://videolectures.net/nipsworkshops2010_guestrin_kml/  
Lecturer: Carlos Guestrin
Institution: Carnegie Mellon University
Date: 2011-01-13
Language: English
Course description: Exponentially increasing dataset sizes have driven machine learning (ML) experts to explore parallel and distributed computing for their research. At the same time, cloud computing resources such as Amazon EC2 have become increasingly available, providing cheap and scalable platforms for large-scale computation. However, because distributed design is complex, it can be difficult for ML researchers to take full advantage of cloud resources. Existing high-level parallel abstractions like MapReduce are insufficiently expressive, while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which compactly expresses asynchronous iterative algorithms with the sparse computational dependencies common in ML, while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of a variety of ML tasks, including learning graphical models with approximate inference, Gibbs sampling, tensor factorization, Co-EM, Lasso, and compressed sensing. We show that GraphLab achieves excellent parallel performance on large-scale real-world problems, and we demonstrate its scalability on Amazon EC2 using up to 256 processors.
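As a rough illustration of what "asynchronous iterative algorithms with sparse computational dependencies" can look like, the sketch below runs a per-vertex update driven by a work queue: each vertex reads only its graph neighbors, and a vertex is rescheduled only when one of its inputs changes noticeably. PageRank is used purely as a stand-in computation (it is not named in the description), the function `async_pagerank` and its parameters are hypothetical, and none of this is GraphLab's actual API. The sketch is also sequential, whereas GraphLab is described as running such updates in parallel while ensuring data consistency.

```python
from collections import deque


def async_pagerank(out_edges, damping=0.85, tol=1e-6):
    """Queue-driven vertex updates on a sparse graph (illustrative only).

    out_edges maps each vertex to a non-empty list of vertices it links to;
    every linked vertex must also appear as a key. Hypothetical helper,
    not part of GraphLab.
    """
    vertices = list(out_edges)
    n = len(vertices)

    # Build the reverse adjacency: an update of v reads only in_edges[v].
    in_edges = {v: [] for v in vertices}
    for u, targets in out_edges.items():
        for v in targets:
            in_edges[v].append(u)

    rank = {v: 1.0 / n for v in vertices}
    queue = deque(vertices)   # vertices waiting to be updated
    queued = set(vertices)    # avoids scheduling the same vertex twice

    while queue:
        v = queue.popleft()
        queued.discard(v)

        # Local update: uses the freshest neighbor values, with no global barrier.
        new_rank = (1.0 - damping) / n + damping * sum(
            rank[u] / len(out_edges[u]) for u in in_edges[v]
        )
        change = abs(new_rank - rank[v])
        rank[v] = new_rank

        # Dynamic scheduling: only vertices that depend on v are revisited.
        if change > tol:
            for w in out_edges[v]:
                if w not in queued:
                    queue.append(w)
                    queued.add(w)
    return rank


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for vertex, value in sorted(async_pagerank(graph).items()):
        print(vertex, round(value, 4))
```

Because work is generated only where values are still changing, converged regions of the graph stop consuming computation, which is the kind of pattern a bulk, all-at-once abstraction such as MapReduce expresses less naturally.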
Keywords: distributed computing; graphical models; asynchronous iterative algorithms
Source: VideoLectures.NET
Last reviewed: 2020-06-15 (wuyq)