Preservation of Statistically Significant Patterns in Multiresolution 0-1 Data
Course URL: http://videolectures.net/prib2010_adhikari_pssp/
Lecturer: Prem Raj Adhikari
Institution: Aalto University
Date: 2010-09-23
Language: English
Abstract: Measurements in biology are made with high-throughput, high-resolution techniques, which often yield data in multiple resolutions. Currently available standard algorithms can handle data in only one resolution. Generative models such as mixture models are often used to model such data, but the significance of the patterns they generate has so far received inadequate attention. This paper analyses the statistical significance of the patterns preserved when sampling between different resolutions and when sampling from a generative model. Furthermore, we study the effect of noise on the likelihood as resolution and sample size change. A finite mixture of multivariate Bernoulli distributions is used to model amplification patterns in cancer in multiple resolutions. Statistically significant itemsets are identified by randomization, both in the original data and in data sampled from the generative models, and the relationships between them are studied. The results show that mixture models effectively preserve statistically significant itemsets, and that preservation is more accurate in coarse resolution than in fine resolution. Furthermore, noise affects data in higher resolution and with fewer samples more than data in lower resolution and with more samples.
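The pipeline described in the abstract (sample 0-1 data from a finite mixture of multivariate Bernoulli distributions, then test an itemset for significance by randomization) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the mixture weights, the parameters in `thetas`, the function names, and the null model (shuffling each column independently, which preserves column sums) are all assumptions chosen for the example.

```python
import random

def sample_bernoulli_mixture(weights, thetas, n, rng):
    """Draw n binary rows from a finite mixture of multivariate Bernoullis."""
    rows = []
    for _ in range(n):
        # Pick a mixture component, then sample each column independently.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        rows.append([1 if rng.random() < p else 0 for p in thetas[k]])
    return rows

def support(rows, itemset):
    """Fraction of rows in which every column of the itemset is 1."""
    return sum(all(r[j] for j in itemset) for r in rows) / len(rows)

def shuffle_columns(rows, rng):
    """Assumed null model: permute each column independently (keeps column sums)."""
    cols = [list(c) for c in zip(*rows)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]

def empirical_p_value(rows, itemset, n_rand, rng):
    """Share of randomized datasets whose support reaches the observed one."""
    observed = support(rows, itemset)
    hits = sum(
        support(shuffle_columns(rows, rng), itemset) >= observed
        for _ in range(n_rand)
    )
    return (hits + 1) / (n_rand + 1)

# Hypothetical two-component mixture over four binary variables.
rng = random.Random(0)
weights = [0.5, 0.5]
thetas = [[0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9]]
data = sample_bernoulli_mixture(weights, thetas, 200, rng)
p = empirical_p_value(data, (0, 1), 99, rng)
print(f"support={support(data, (0, 1)):.2f}, empirical p={p:.2f}")
```

Comparing which itemsets come out significant in the original data and in data resampled from a fitted mixture would be the natural next step; fitting the mixture itself (for example with EM) is left out of this sketch.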
Keywords: computer science; machine learning; pattern recognition
Source: VideoLectures.NET
Last reviewed: 2019-11-18 (cwx)
Views: 34