Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-term Interactions
Course URL: http://videolectures.net/kdd09_ye_dgepausftti/
Lecturer: Jieping Ye
Institution: University of Michigan
Date: 2009-09-14
Language: English
Course description: Drosophila gene expression pattern images document the spatial and temporal dynamics of gene expression, and they are valuable tools for elucidating gene functions, interactions, and networks during Drosophila embryogenesis. To provide text-based pattern searching, the images in the Berkeley Drosophila Genome Project (BDGP) study are annotated manually with ontology terms by human curators. We present a systematic approach for automating this task, because the number of images needing text descriptions is increasing rapidly. We consider both an improved feature representation and a novel learning formulation to boost annotation performance. For feature representation, we adapt the bag-of-words scheme commonly used in visual recognition problems so that the image group information in the BDGP study is retained. Moreover, images from multiple views can be integrated naturally in this representation. To reduce the quantization error caused by the bag-of-words representation, we propose an improved feature representation scheme based on sparse learning. For the learning formulation, we propose a local regularization framework that explicitly incorporates the correlations among terms. We further show that the resulting optimization problem admits an analytical solution. Experimental results show that the representation based on sparse learning significantly outperforms the bag-of-words representation. Results also show that incorporating term-term correlations consistently improves annotation performance.
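The difference between the two feature representations mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the random toy data, the ISTA solver, the penalty weight `lam`, and the max-pooling step used to summarize an image group are all assumptions chosen for illustration. Bag-of-words assigns each local descriptor to its single nearest codeword, while sparse coding lets a descriptor be reconstructed from a few codewords at once.

```python
import numpy as np


def bag_of_words(descriptors, codebook):
    """Hard-assign each descriptor to its nearest codeword and count assignments."""
    # Squared Euclidean distances, shape (n_descriptors, n_codewords).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=codebook.shape[0])
    return hist, nearest


def lasso_objective(descriptors, codes, codebook, lam):
    """0.5 * ||X - A C||_F^2 + lam * ||A||_1 for codes A and codebook C."""
    residual = descriptors - codes @ codebook
    return 0.5 * np.sum(residual ** 2) + lam * np.sum(np.abs(codes))


def sparse_codes(descriptors, codebook, lam=0.1, n_iter=200):
    """Approximately minimize the lasso objective with ISTA (an assumed solver)."""
    # Lipschitz constant of the smooth part: largest eigenvalue of C C^T.
    lipschitz = np.linalg.norm(codebook @ codebook.T, 2)
    codes = np.zeros((descriptors.shape[0], codebook.shape[0]))
    for _ in range(n_iter):
        grad = (codes @ codebook - descriptors) @ codebook.T
        z = codes - grad / lipschitz
        # Soft-thresholding is the proximal step for the L1 penalty.
        codes = np.sign(z) * np.maximum(np.abs(z) - lam / lipschitz, 0.0)
    return codes


# Toy data standing in for local descriptors of one image group (hypothetical).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))   # 20 descriptors, 8 dimensions each
C = rng.normal(size=(5, 8))    # codebook with 5 codewords

hist, nearest = bag_of_words(X, C)
A = sparse_codes(X, C, lam=0.1)

# Group-level features: a count histogram for bag-of-words, and
# (as an assumption) max-pooled absolute coefficients for sparse codes.
bow_feature = hist
sparse_feature = np.abs(A).max(axis=0)

# Reconstruction errors of the two encodings, for side-by-side inspection.
bow_error = np.sum((X - C[nearest]) ** 2)
sparse_error = np.sum((X - A @ C) ** 2)
```

Comparing `bow_error` with `sparse_error` on real descriptors gives a rough sense of the quantization error each encoding leaves behind. The outcome depends on the codebook and on `lam`, so the sketch makes no claim about which value is smaller for a given dataset.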
Keywords: Drosophila genes; gene expression; genome project
Source: VideoLectures.NET
Last reviewed: 2019-05-10 (lxf)
Views: 61