Using PPI Networks in hierarchical multi-label classification trees for gene function prediction
Course URL: http://videolectures.net/mlsb2012_stojanova_ppi/  
Lecturer: Daniela Stojanova
Institution: Jožef Stefan Institute
Date: 2012-10-23
Language: English
Abstract:
**Motivation:** Catalogs such as Gene Ontology (GO) and MIPS-FUN assume that functional classes are organized hierarchically (general functions include more specific functions). This has recently motivated the development of several machine learning algorithms under the assumption that instances may belong to multiple hierarchically organized classes. Besides relationships among classes, it is also possible to identify relationships among examples. Although such relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical protein function prediction. Using such relationships between genes introduces autocorrelation and violates the assumption that instances are independently and identically distributed, which underlies most machine learning algorithms. While this consideration adds complexity to the learning process, we expect it to also carry substantial benefits.
**Results:** This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). The empirical evaluation of the proposed algorithm, called NHMC, on 24 yeast datasets using MIPS-FUN and GO annotations and exploiting three different PPI networks clearly shows that taking autocorrelation into account improves performance.
**Conclusions:** Our results suggest that explicitly taking network autocorrelation into account increases the predictive capability of the models, especially when the underlying PPI network is dense. Furthermore, NHMC can be used as a tool to assess network data and the information it provides with respect to gene function.
Keywords: machine learning algorithms; hierarchical structure; hierarchical multi-label classification
Source: VideoLectures.NET
Last reviewed: 2020-07-29:yumf
Views: 67