Learning equivalence classes of directed acyclic latent variable models from multiple datasets with overlapping variables, incl. discussion by Ricardo Silva
Course URL: http://videolectures.net/aistats2011_tillman_learning/
Lecturers: Ricardo Silva, Robert E. Tillman
Institution: University College London
Date: 2011-05-06
Language: English
Course description: While there has been considerable research in learning probabilistic graphical models from data for predictive and causal inference, almost all existing algorithms assume a single dataset of i.i.d. observations for all variables. For many applications, it may be impossible or impractical to obtain such datasets, but multiple datasets of i.i.d. observations for different subsets of these variables may be available. Tillman et al. [2009] showed how directed graphical models learned from such datasets can be integrated to construct an equivalence class of structures over all variables. While their procedure is correct, it assumes that the structures integrated do not entail contradictory conditional independences and dependences for variables in their intersections. While this assumption is reasonable asymptotically, it rarely holds in practice with finite samples due to the frequency of statistical errors. We propose a new correct procedure for learning such equivalence classes directly from the multiple datasets which avoids this problem and is thus more practically useful. Empirical results indicate our method is not only more accurate, but also faster and requires less memory.
Keywords: equivalence classes of directed acyclic latent variable models; overlapping variables
Source: VideoLectures.NET
Last reviewed: 2021-02-04:nkq
Views: 37