Subgroup discovery experiments in functional genomics |
|
Course URL: | http://videolectures.net/solomon_gamberger_sdefg/ |
Lecturer: | Dragan Gamberger |
Institution: | Ruđer Bošković Institute |
Date: | 2007-02-25 |
Language: | English |
Course description: | Functional genomics is a typical scientific discovery domain characterized by a very large number of attributes (genes) relative to the number of examples (observations). The danger of data overfitting is crucial in such domains. To avoid this pitfall and achieve predictor robustness, state-of-the-art approaches construct complex classifiers that combine relatively weak contributions of up to thousands of genes (attributes) to classify a disease. The complexity of such classifiers limits their transparency and consequently the biological insight they can provide. The goal of this study is to apply to this domain the methodology of constructing simple yet robust logic-based classifiers amenable to direct expert interpretation. The approach is based on the subgroup discovery rule learning methodology, enhanced by methods of restricting the hypothesis search space by exploiting the relevancy of features that enter the rule construction process as well as their combinations that form the rules. A multi-class functional genomics problem of classifying fourteen cancer types based on more than 16,000 gene expression values is used to illustrate the methodology. Some of the discovered rules allow for novel biological interpretations. |
Keywords: | functional genomics; data overfitting; rule construction |
Source: | VideoLectures.NET |
Last reviewed: | 2019-09-21:cwx |
Views: | 45 |