Local Modeling of Attributed Graphs: Algorithms and Applications |
|
Course URL: | http://videolectures.net/kdd2017_perozzi_attributed_graphs/ |
Instructor: | Bryan Perozzi |
Institution: | Stony Brook University |
Lecture date: | 2017-10-09 |
Language: | English |
Course description: | It is increasingly common to encounter real-world graphs that have attributes associated with the nodes, in addition to their raw connectivity information. For example, social networks contain both friendship relations and user attributes such as interests and demographics. A protein-protein interaction network may have not only the interaction relations but also the expression levels associated with the proteins. Such information can be described by a graph in which nodes represent the objects, edges represent the relations between them, and feature vectors associated with the nodes represent the attributes. This graph data is often referred to as an attributed graph. This thesis focuses on developing scalable algorithms and models for attributed graphs. This data can be viewed as either discrete (a set of edges) or continuous (distances between embedded nodes), and I examine the issue from both sides. Specifically, I present an online learning algorithm which utilizes recent advances in deep learning to create rich graph embeddings. The multiple scales of social relationships encoded by this novel approach are useful for multi-label classification and regression tasks in networks. I also present local algorithms for anomalous community scoring in discrete graphs. These algorithms discover subsets of the graph’s attributes which cause communities to form (e.g., shared interests on a social network). The scalability of all the methods in this thesis is ensured by building from a restricted set of graph primitives, such as ego-networks and truncated random walks, which exploit the local information around each vertex. In addition, limiting the scope of graph dependencies we consider enables my approaches to be trivially parallelized using commodity tools for big data processing, like MapReduce or Spark.
The applications of this work are broad and far reaching across the fields of data mining and information retrieval, including user profiling/demographic inference, online advertising, and fraud detection. |
Keywords: | attributed graphs; local modeling; algorithms and applications |
Source: | VideoLectures.NET |
Data collected: | 2023-05-24:chenxin01 |
Last reviewed: | 2023-05-24:chenxin01 |
Views: | 46 |
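
The course description names ego-networks and truncated random walks as the local graph primitives behind the methods. As a rough illustration only, the sketch below shows what those two primitives can look like on a small adjacency-list graph. It is not the author's implementation: the function names `ego_network` and `truncated_random_walk`, the toy graph `adj`, and the fixed seed are all assumptions made for this example.

```python
import random


def ego_network(adj, v):
    """Return the 1-hop ego-network of v as an adjacency dict.

    The ego-network holds v, its direct neighbors, and every edge
    whose two endpoints both fall inside that node set.
    """
    nodes = {v} | set(adj[v])
    return {u: sorted(w for w in adj[u] if w in nodes) for u in sorted(nodes)}


def truncated_random_walk(adj, start, length, rng):
    """Walk uniformly at random from start, stopping after `length` nodes.

    The walk is "truncated" because it is cut off at a fixed length
    instead of running until convergence; it also stops early if it
    reaches a node with no neighbors.
    """
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


# Hypothetical undirected toy graph (each edge listed in both directions).
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

rng = random.Random(0)  # fixed seed so repeated runs give the same walk
ego_a = ego_network(adj, "a")
walk = truncated_random_walk(adj, "e", 4, rng)
```

Both functions read only the neighborhood of the vertex they start from, which is the locality property the description relies on when it says the methods can be parallelized across vertices.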