Architecture Conscious Data Analysis: Progress and Future Outlook
Course URL: http://videolectures.net/eml07_parthasarathy_acd/
Lecturer: Srinivasan Parthasarathy
Institution: Ohio State University
Date: 2007-12-29
Language: English
Abstract: Over the past several years, architectural innovation in processor design has led to new capabilities in single-chip commodity processing and high-end compute clusters. Examples include hardware prefetching, simultaneous multithreading (SMT), and, more recently, true chip multiprocessing. At the very high end, system area networking technologies like InfiniBand have spurred the development of affordable cluster-based supercomputers capable of storing and managing petabytes of data. We contend that data mining and machine learning algorithms, which often require significant computational, I/O, and communication resources, stand to benefit from such innovations if appropriately leveraged. The challenges in doing so are daunting.
First, a large number of state-of-the-art data mining algorithms grossly under-utilize modern processors, the building blocks of current-generation commodity clusters. This is due to the widening gap between processor and memory performance and to the memory- and I/O-intensive nature of these applications. Second, the emergence of multi-core architectures in the commodity market brings further complications. Key challenges brought to the fore include the need to enhance available fine-grained parallelism and to alleviate memory bandwidth pressure. Third, parallelizing data mining algorithms on a multi-level cluster environment is a challenge, given the need to share and communicate large sets of data and to balance the workload in the presence of data skew.
In this talk I will discuss progress made in the context of these challenges and attempt to demonstrate that "architecture conscious" solutions are both viable and necessary. I will attempt to separate general methodologies and techniques from specific instantiations whenever it makes sense. We will conclude with a discussion of the future outlook, both in the context of systems support for next-generation algorithms and in terms of the educational objectives brought to the fore in this context.
This is joint work with my graduate students Gregory Buehrer, Amol Ghoting, and Shirish Tatikonda.
Keywords: processors; single-chip commodity processing; high-end compute clusters
Source: VideoLectures.NET
Last reviewed: 2019-04-10 (lxf)