Research on Distributed Classification in Peer-to-Peer Networks

Distributed Classification in Peer-to-Peer Networks
Course URL: http://videolectures.net/kdd07_luo_dcip/
Lecturer: Ping Luo
Institution: Chinese Academy of Sciences
Course date: Not available.
Course language: English
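Summary (translated from Chinese):
This paper studies distributed classification in P2P networks. Although there is a large body of work on distributed classification, most existing algorithms were not designed for P2P networks. As server-less and router-less systems, P2P networks pose several challenges for distributed classification: (1) global synchronization is impractical in a large-scale P2P network; (2) the topology changes frequently because peers often fail and recover; (3) the data on each peer is updated dynamically and often. The paper proposes an ensemble paradigm for distributed classification in P2P networks. Under this paradigm, every peer builds local classifiers on its local data, and the results of all local classifiers are then combined by plurality voting. To build the local classifiers, the authors adopt the pasting bites learning algorithm, which generates multiple local classifiers on each peer from the local data. To combine the local results, they propose a general form of the Distributed Plurality Voting (DPV) protocol for dynamic P2P networks. The protocol preserves single-site validity in dynamic networks and supports two computing modes, one-shot query and continuous monitoring. The authors prove theoretically that the condition C0 for sending messages in DPV0 is locally communication-optimal for achieving these properties. Finally, experiments on real-world P2P networks show that: (1) the proposed ensemble paradigm is effective even with thousands of local classifiers; (2) in most cases the DPV0 algorithm is local, because voting is processed using information gathered from a very small vicinity whose size is independent of the network size; (3) DPV0 is more communication-efficient than existing distributed plurality voting algorithms.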
Course description: This work studies the problem of distributed classification in peer-to-peer (P2P) networks. While there has been a significant amount of work in distributed classification, most existing algorithms are not designed for P2P networks. Indeed, as server-less and router-less systems, P2P networks impose several challenges for distributed classification: (1) it is not practical to have global synchronization in large-scale P2P networks; (2) there are frequent topology changes caused by frequent failure and recovery of peers; and (3) there are frequent on-the-fly data updates on each peer. In this paper, we propose an ensemble paradigm for distributed classification in P2P networks. Under this paradigm, each peer builds its local classifiers on the local data, and the results from all local classifiers are then combined by plurality voting. To build local classifiers, we adopt the pasting bites learning algorithm to generate multiple local classifiers on each peer based on the local data. To combine local results, we propose a general form of the Distributed Plurality Voting (DPV) protocol in dynamic P2P networks. This protocol keeps the single-site validity for dynamic networks, and supports the computing modes of both one-shot query and continuous monitoring. We theoretically prove that the condition C0 for sending messages used in DPV0 is locally communication-optimal to achieve the above properties. Finally, experimental results on real-world P2P networks show that: (1) the proposed ensemble paradigm is effective even if there are thousands of local classifiers; (2) in most cases, the DPV0 algorithm is local in the sense that voting is processed using information gathered from a very small vicinity, whose size is independent of the network size; (3) DPV0 is significantly more communication-efficient than existing algorithms for distributed plurality voting.
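As a rough illustration of the combination step described above, the sketch below tallies the class labels predicted by each peer's local classifiers and returns the label with the most votes overall. It is only a centralized reference computation of the plurality result, not the DPV protocol itself, which the abstract says reaches the answer through local message exchange. The function names, the peer names, and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter
from typing import Hashable, Iterable


def local_vote_counts(predictions: Iterable[Hashable]) -> Counter:
    """Tally the labels predicted by one peer's local classifiers."""
    return Counter(predictions)


def plurality_vote(peer_counts: Iterable[Counter]) -> Hashable:
    """Sum all peers' vote vectors and return the label with the most votes.

    Ties are broken by the smallest label under string ordering; this rule
    is an assumption of the sketch, not something stated in the abstract.
    """
    total: Counter = Counter()
    for counts in peer_counts:
        total.update(counts)  # element-wise addition of vote counts
    if not total:
        raise ValueError("no votes were cast")
    best = max(total.values())
    return min((label for label, n in total.items() if n == best), key=str)


# Hypothetical example: three peers, each with a few local classifiers
# that have already predicted a label for the same test instance.
peers = {
    "peer_a": ["spam", "ham", "spam"],
    "peer_b": ["ham", "ham"],
    "peer_c": ["spam", "spam", "other"],
}
winner = plurality_vote(local_vote_counts(p) for p in peers.values())
print(winner)
```

In a real P2P setting no single node would see every vote vector, which is the gap a protocol such as DPV is meant to close.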
Keywords: distributed classification; topology; ensemble paradigm
Course source: VideoLectures.NET
Last reviewed: 2019-11-18:cwx
Views: 32