Identifying Feature Relevance using a Random Forest
Course URL: http://videolectures.net/slsfs05_rogers_ifrur/
Lecturer: Jeremy D. Rogers
Institution: University of Southampton
Course date: 2007-02-25
Language: English
Course description: Many feature selection algorithms are limited in that they attempt to identify relevant feature subsets by examining the features individually. This paper introduces a technique for determining feature relevance using the average information gain achieved during the construction of decision tree ensembles. The technique introduces a node complexity measure and a statistical method for updating the feature sampling distribution based upon confidence intervals to control the rate of convergence. Experiments demonstrate the potential of this method for feature selection and subspace identification.
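To make the idea of "average information gain" concrete, here is a minimal Python sketch. It is a simplified illustration, not the lecture's method: it scores each feature by its best single-threshold split on bootstrap resamples and averages the gains, and it omits the node complexity measure and the confidence-interval update of the feature sampling distribution. The function names (`entropy`, `best_split_gain`, `average_gain`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest information gain over all threshold splits on one feature."""
    parent = entropy(labels)
    n = len(labels)
    best = 0.0
    for t in sorted(set(values))[:-1]:  # every value except the largest
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        child = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        best = max(best, parent - child)
    return best


def average_gain(X, y, n_rounds=50, seed=0):
    """Average each feature's best-split gain over bootstrap resamples.

    Assumption: one root-level split per resample stands in for a full tree.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    totals = [0.0] * d
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [y[i] for i in idx]
        for j in range(d):
            totals[j] += best_split_gain([X[i][j] for i in idx], ys)
    return [t / n_rounds for t in totals]


# Toy data: feature 0 determines the label, feature 1 is noise.
X = [[0, 3], [0, 1], [1, 3], [1, 1], [0, 2], [1, 2]]
y = [0, 0, 1, 1, 0, 1]
scores = average_gain(X, y)
```

On this toy data, feature 0 should never score below feature 1, because a split on it always removes all label uncertainty in a resample.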
Keywords: feature selection algorithms; feature subsets; decision trees
Source: VideoLectures.NET
Last reviewed: 2019-09-21 (cwx)
Views: 167