Developing a Comprehensive Framework for Multimodal Feature Extraction |
|
Course URL: | https://videolectures.net/videos/kdd2017_mcnamara_comprehensive_f... |
Lecturer: | Quinten McNamara |
Host: | KDD 2017 Workshops |
Date: | 2017-10-09 |
Language: | English |
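Abstract (translated from Chinese): | Feature extraction is a key component of many applied data science workflows. In recent years, rapid progress in artificial intelligence and machine learning has produced an explosion of feature extraction tools and services that let data scientists annotate their data cheaply and effectively along a wide range of dimensions, from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the growth in powerful feature extraction services has been matched by a growth in the number of distinct interfaces to them. In a world where almost every new service has its own API, documentation, and/or client library, data scientists who need to combine different features from multiple sources are often forced to write and maintain increasingly complex feature extraction pipelines. To address this challenge, the authors introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of various data types (video, images, audio, and text) and is designed explicitly with ease of use and extensibility in mind. With just a few lines of Python code, users can apply a variety of pre-existing feature extraction tools to their data, and they can easily add their own custom extractors by writing modular classes. A graph-based API supports rapid development of complex feature extraction pipelines that output results in a single, standardized format. The talk describes the package's architecture, details its main advantages over earlier feature extraction toolboxes, and uses a sample application to a large functional MRI dataset to show how Pliers can significantly reduce the time and effort needed to build complex feature extraction workflows while improving code clarity and maintainability. |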
Course description: | Feature extraction is a critical component of many applied data science workflows. In recent years, rapid advances in artificial intelligence and machine learning have led to an explosion of feature extraction tools and services that allow data scientists to cheaply and effectively annotate their data along a vast array of dimensions, ranging from detecting faces in images to analyzing the sentiment expressed in coherent text. Unfortunately, the proliferation of powerful feature extraction services has been mirrored by a corresponding expansion in the number of distinct interfaces to those services. In a world where nearly every new service has its own API, documentation, and/or client library, data scientists who need to combine diverse features obtained from multiple sources are often forced to write and maintain ever more elaborate feature extraction pipelines. To address this challenge, we introduce a new open-source framework for comprehensive multimodal feature extraction. Pliers is an open-source Python package that supports standardized annotation of diverse data types (video, images, audio, and text), and is expressly designed with both ease of use and extensibility in mind. Users can apply a wide range of pre-existing feature extraction tools to their data in just a few lines of Python code, and can also easily add their own custom extractors by writing modular classes. A graph-based API enables rapid development of complex feature extraction pipelines that output results in a single, standardized format. We describe the package's architecture, detail its major advantages over previous feature extraction toolboxes, and use a sample application to a large functional MRI dataset to illustrate how Pliers can significantly reduce the time and effort required to construct sophisticated feature extraction workflows while increasing code clarity and maintainability. |
Keywords: | multimodal features; comprehensive framework; feature extraction pipeline |
Source: | VideoLectures.NET |
Data collected: | 2024-12-25:liyq |
Last reviewed: | 2024-12-25:liyq |
Views: | 98 |
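
The course description mentions custom extractors written as modular classes and pipelines that emit results in a single standardized format. The sketch below illustrates that general design idea in plain Python. It does not use or reproduce the Pliers API: every name in it (`Stim`, `Extractor`, `WordCountExtractor`, `MeanWordLengthExtractor`, `run_pipeline`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stim:
    """Hypothetical container: a piece of input data tagged with its modality."""
    modality: str  # e.g. "text", "image", "audio", "video"
    data: object


class Extractor:
    """Hypothetical base class: subclasses set `modality` and implement `_extract`."""
    modality = None

    def transform(self, stim: Stim) -> List[Dict[str, object]]:
        # Refuse inputs of the wrong modality instead of failing obscurely later.
        if stim.modality != self.modality:
            raise TypeError(
                f"{type(self).__name__} expects {self.modality!r}, got {stim.modality!r}"
            )
        features = self._extract(stim)
        # Every extractor reports results as rows with the same three keys.
        return [
            {"extractor": type(self).__name__, "feature": name, "value": value}
            for name, value in features.items()
        ]

    def _extract(self, stim: Stim) -> Dict[str, object]:
        raise NotImplementedError


class WordCountExtractor(Extractor):
    modality = "text"

    def _extract(self, stim: Stim) -> Dict[str, object]:
        return {"word_count": len(stim.data.split())}


class MeanWordLengthExtractor(Extractor):
    modality = "text"

    def _extract(self, stim: Stim) -> Dict[str, object]:
        words = stim.data.split()
        mean = sum(len(w) for w in words) / len(words) if words else 0.0
        return {"mean_word_length": mean}


def run_pipeline(stim: Stim, extractors: List[Extractor]) -> List[Dict[str, object]]:
    """Apply each extractor in turn and concatenate the standardized rows."""
    rows: List[Dict[str, object]] = []
    for extractor in extractors:
        rows.extend(extractor.transform(stim))
    return rows


if __name__ == "__main__":
    stim = Stim("text", "pliers makes feature extraction simple")
    for row in run_pipeline(stim, [WordCountExtractor(), MeanWordLengthExtractor()]):
        print(row)
```

Because every extractor returns rows with the same keys, adding a new feature source under this sketch means writing one small subclass, and downstream code that consumes the rows does not change.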