Analyzing and Linking Big Data with Stratosphere |
|
Course URL: | http://videolectures.net/dataforum2012_tzoumas_big_data/ |
Instructor: | Kostas Tzoumas |
Institution: | Technische Universität Berlin (TU Berlin) |
Date: | 2012-07-16 |
Language: | English |
Course description: | Linking and Analyzing Big Data. Summary of the presentation: In this talk, I will provide an overview of two projects at TU Berlin and the research and innovation challenges at their intersection. Stratosphere ([url], funded by the German Research Foundation) is an open platform for Big Data analytics. It features a cloud-enabled execution engine with flexible fault tolerance schemes, a novel programming model centered around second-order functions that extends MapReduce, and a cost-based query optimizer. Stratosphere is validated by several use-case scenarios, including climate data analysis, text mining in bioinformatics, and data cleansing on Linked Open Data. DOPA (an FP7 STREP project) focuses on linking large data pools of both structured and unstructured data using data supply chains. The goal is to multiply the utility of each individual service while sharing the costs among them. In this way, DOPA lowers the barrier to entry for SMEs that need to perform advanced analytics across multiple data pools, since neither the required input data nor the processing environment has to be provided by the SME itself. |
Keywords: | Stratosphere; bioinformatics; cloud-enabled |
Source: | VideoLectures.NET |
Last reviewed: | 2021-01-31:nkq |
Views: | 48 |