CloudMatcher: A Cloud/Crowd Service for Entity Matching
Course URL: http://videolectures.net/kdd2017_govind_entity_matching/
Lecturer: Yash Govind
Institution: University of Wisconsin-Madison
Date: 2017-12-01
Language: English
Course description: Entity matching (EM) finds disparate data instances that refer to the same real-world entity. EM is critical in health informatics, and will become even more so in the age of Big Data and data science. Many EM systems have been developed. In this paper, we first discuss why it is still very difficult for domain scientists to use such EM systems. We then describe CloudMatcher, a cloud/crowd service for EM that we have been building. CloudMatcher aims to be a fast, easy-to-use, scalable, and highly available EM service on the Web. We motivate CloudMatcher, then describe its design and implementation. Next, we describe its deployment over the past six months, providing a detailed analysis of its performance on four representative datasets. Finally, we discuss lessons learned.
Keywords: entity matching; data science; EM systems
Source: VideoLectures.NET
Data collected: 2023-03-20 (chenxin01)
Last reviewed: 2023-05-19 (liyy)