Automated Hypothesis Generation Based on Mining Scientific Literature
Course URL: http://videolectures.net/kdd2014_spangler_lichtarge_scientific_li...
Lecturers: Olivier Lichtarge; W. Scott Spangler
Institutions: Baylor College of Medicine; IBM Research
Date: 2014-10-07
Language: English
Course description: Keeping up with the ever-expanding flow of data and publications is untenable and poses a fundamental bottleneck to scientific progress. Current search technologies typically find many relevant documents, but they do not extract and organize the information content of these documents or suggest new scientific hypotheses based on this organized content. We present an initial case study on KnIT, a prototype system that mines the information contained in the scientific literature, represents it explicitly in a queriable network, and then further reasons upon these data to generate novel and experimentally testable hypotheses. KnIT combines entity detection with neighbor-text feature analysis and with graph-based diffusion of information to identify potential new properties of entities that are strongly implied by existing relationships. We discuss a successful application of our approach that mines the published literature to identify new protein kinases that phosphorylate the protein tumor suppressor p53. Retrospective analysis demonstrates the accuracy of this approach, and ongoing laboratory experiments suggest that kinases identified by our system may indeed phosphorylate p53. These results establish proof of principle for automated hypothesis generation and discovery based on text mining of the scientific literature.
Keywords: scientific literature; literature mining; hypothesis generation
Source: VideoLectures.NET
Data collected: 2023-06-11:chenxin01
Last reviewed: 2023-06-11:chenxin01
Views: 25