

Some Challenging Machine Learning Problems in Computational Biology: Time-Varying Networks Inference and Sparse Structured Input-Output Learning
Course URL: http://videolectures.net/cmulls08_xing_scml/
Lecturer: Eric P. Xing
Institution: Carnegie Mellon University
Date: 2009-01-05
Language: English
Course description: Recent advances in high-throughput technologies such as microarrays and genome-wide sequencing have led to an avalanche of new biological data that are dynamic, noisy, heterogeneous, and high-dimensional. These data have raised unprecedented challenges in machine learning and high-dimensional statistical analysis, and their close relevance to human health and social welfare has often created unique demands on performance metrics that differ from those of standard data mining or pattern recognition problems. In this talk, I will discuss two such problems. First, I will present a new statistical formalism for modeling network evolution over time, and several new algorithms, based on temporal extensions of sparse graphical logistic regression, for parsimoniously reverse-engineering the latent time-varying networks. I will show some promising results on recovering the latent sequence of temporally rewiring gene networks over more than 4000 genes during the life cycle of Drosophila melanogaster from a microarray time course, at a time resolution limited only by the sampling frequency. Second, I will present a family of sparse structured regression models in the context of uncovering true associations between linked genetic variations (inputs) in the genome and networks of human traits (outputs) in the phenome. If time allows, I will also present another class of new models known as maximum entropy discrimination Markov networks, which address the same problem in the maximum-margin paradigm but use an entropic regularizer that leads to a distribution of structured prediction functions that are simultaneously primal and dual sparse (i.e., with few support vectors and of low effective feature dimension). Joint work with Amr Ahmed, Seyoung Kim, Mladen Kolar, Le Song, and Jun Zhu.
Keywords: computational biology; machine learning; mathematical models
Source: VideoLectures.NET
Last reviewed: 2020-06-22 (chenxin)
Views: 46