Experiment Databases for Machine Learning / BenchMarking Via Weka |
|
Course URL: | http://videolectures.net/mloss08_reutemann_edml/ |
Lecturer: | Peter Reutemann |
Institution: | University of Waikato |
Date: | 2008-12-20 |
Language: | English |
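Summary (translated from the Chinese): | **Experiment Databases for Machine Learning**\\ Experiment Databases for Machine Learning is a large public repository of machine learning experiments and also a framework for building similar databases for specific goals. The project brings together the information contained in many machine learning experiments and organizes it so that anyone can investigate how learning algorithms performed in earlier studies. To share this information, a common language called ExpML is proposed; it captures the basic structure of a wide range of machine learning experiments while staying open to future extensions. The language also strengthens reproducibility by requiring links to the datasets and algorithms used and by storing every detail of the experiment setup. All stored information can then be accessed by querying the database, which gives a powerful way to collect and reorganize the data and allows a very thorough examination of the stored results. The current public database contains over 500,000 classification and regression experiments and comes with an online interface as well as a stand-alone explorer tool offering various visualization techniques. The framework can also be integrated into machine learning toolboxes to stream results automatically to a global (or local) experiment database, or to download experiments that have been run before. **BenchMarking Via Weka**\\ BenchMarking Via Weka is a client-server architecture that supports interoperability between different machine learning systems. Machine learning systems need to provide mechanisms for processing data and evaluating the generated models. In this system, the server hosts all the data and performs all statistical analyses, while the client performs all pre-processing and model building. This separation of tasks makes a cross-platform, cross-language framework possible. Performing the statistical analyses on the host avoids unnecessary exchange and conversion of generated results. |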
Course description: | **Experiment Databases for Machine Learning**\\ Experiment Databases for Machine Learning is a large public repository of machine learning experiments as well as a framework for producing similar databases for specific goals. This project aims to bring the information contained in many machine learning experiments together and organize it in a way that allows everyone to investigate how learning algorithms have performed in previous studies. To share such information with the world, a common language is proposed, dubbed ExpML, capturing the basic structure of a large range of machine learning experiments while remaining open for future extensions. This language also enforces reproducibility by requiring links to the datasets and algorithms used and by storing all details of the experiment setup. All stored information can then be accessed by querying the database, creating a powerful way to collect and reorganize the data and allowing a very thorough examination of the stored results. The current publicly available database contains over 500,000 classification and regression experiments, and has both an online interface and a stand-alone explorer tool offering various visualization techniques. This framework can also be integrated in machine learning toolboxes to automatically stream results to a global (or local) experiment database, or to download experiments that have been run before. **BenchMarking Via Weka**\\ BenchMarking Via Weka is a client-server architecture that supports interoperability between different machine learning systems. Machine learning systems need to provide mechanisms for processing data and evaluating generated models. In our system, the server hosts all the data and performs all the statistical analyses, while the client performs all the pre-processing and model building. This separation of tasks opens up the possibility of offering a cross-platform and cross-language framework. By performing statistical analyses on the host, we avoid unnecessary exchange and conversion of generated results. |
Keywords: | Computer science; machine learning; experiment databases |
Source: | VideoLectures.NET |
Last reviewed: | 2021-12-23:liyy |
Views: | 58 |