Using Rule-Based Labels for Weak Supervised Learning
Course URL: http://videolectures.net/kdd2018_goh_rule-based_learning/
Lecturer: Garrett B. Goh
Institution: Pacific Northwest National Laboratory
Date: 2018-11-23
Language: English
Course description: With access to large datasets, deep neural networks (DNNs) have achieved human-level accuracy in image and speech recognition tasks. In chemistry, however, data are inherently small and fragmented. In this work, we develop an approach that uses rule-based knowledge to train ChemNet, a transferable and generalizable deep neural network for chemical property prediction that learns in a weakly supervised manner from large unlabeled chemical databases. When coupled with transfer learning to predict chemical properties from smaller datasets that it was not originally trained on, ChemNet outperforms contemporary DNN models trained using conventional supervised learning. Furthermore, we demonstrate that the ChemNet pre-training approach is equally effective on both CNN (Chemception) and RNN (SMILES2vec) models, indicating that the approach is agnostic to network architecture and effective across multiple data modalities. Our results indicate that a pre-trained ChemNet incorporates chemistry domain knowledge and enables the development of generalizable neural networks for more accurate prediction of novel chemical properties.
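Keywords: large datasets; deep neural networks; image and speech recognition tasks; ChemNet pre-training approach
Source: VideoLectures.NET
Data collected: 2023-01-24:cyh
Last reviewed: 2023-01-24:cyh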
Views: 29