Bioinformatics Challenge: Learning in Very High Dimensions with Very Few Samples |
|
Course URL: | http://videolectures.net/mlss05au_kowalczyk_lvhdv/ |
Lecturer: | Adam Kowalczyk |
Institution: | National ICT Australia (NICTA) |
Course date: | 2007-02-25 |
Course language: | English |
Course description: | Dedicated machine learning procedures have become an integral part of modern genomics and proteomics. However, these tasks, which combine very high dimensionality with very small learning samples, often stretch the procedures well beyond the natural boundaries of their applicability. A few such challenges are the subject of this series of lectures. We start with a brief overview of the classification of genomics (microarray) data and discuss, in some detail, examples of applications to cancer genomics and proteomics. We then concentrate on the phenomenon of anti-learning: a case of supervised classification in which standard supervised learning techniques systematically produce classifiers that are perfect on the learning sample but whose independent test error rates are higher than that of the default (random) classification rule. Examples of natural and synthetic anti-learning data are given and analysed from the standpoint of their implications for practical supervised and unsupervised classification. A series of practical tutorials will be organized in parallel, giving participants hands-on exposure to the classification of microarray data, including first-hand experience with anti-learning. |
Keywords: | Machine learning; genomics; anti-learning data |
Course source: | VideoLectures.NET |
Last reviewed: | 2021-01-15: yumf |
Views: | 61 |
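
The anti-learning phenomenon named in the course description can be illustrated with a small synthetic sketch. This is not the lecturer's own construction. It is a minimal example under stated assumptions: the Gram matrix `G = I - (alpha/n) * y y^T`, the values `n = 20` and `alpha = 0.5`, the 12/8 train/test split, and the helper names are all hypothetical choices made for illustration. In such data, points of the same class are slightly less similar to each other than points of opposite classes, so a simple centroid classifier fits the training sample perfectly yet misclassifies every held-out point.

```python
import numpy as np


def make_anti_learning_data(n=20, alpha=0.5):
    """Return n points (n even) whose Gram matrix is I - (alpha/n) * y y^T.

    Hypothetical construction: same-class pairs get inner product -alpha/n,
    opposite-class pairs get +alpha/n. For 0 < alpha < 1 the matrix is
    positive definite, so real-valued features exist.
    """
    y = np.array([1.0, -1.0] * (n // 2))
    gram = np.eye(n) - (alpha / n) * np.outer(y, y)
    # Factor the Gram matrix: X @ X.T == gram (up to rounding).
    eigvals, eigvecs = np.linalg.eigh(gram)
    X = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return X, y


def centroid_accuracy(X, y, train_idx, test_idx):
    """Fit w = sum_i y_i x_i on the training rows; return (train, test) accuracy."""
    w = (y[train_idx, None] * X[train_idx]).sum(axis=0)
    train_acc = np.mean(np.sign(X[train_idx] @ w) == y[train_idx])
    test_acc = np.mean(np.sign(X[test_idx] @ w) == y[test_idx])
    return train_acc, test_acc


X, y = make_anti_learning_data()
rng = np.random.default_rng(0)
perm = rng.permutation(len(y))
train_idx, test_idx = perm[:12], perm[12:]
train_acc, test_acc = centroid_accuracy(X, y, train_idx, test_idx)
print(train_acc, test_acc)
```

Under these assumptions the score of a training point is `y_j * (1 - alpha * |S| / n)` and the score of a held-out point is `-y_j * alpha * |S| / n`, where `|S|` is the training-set size. Training accuracy is therefore 1.0 and test accuracy is 0.0, which is worse than the roughly 50% a random rule would achieve on balanced classes.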