Machine Learning, Uncertain Information, and the Inevitability of Negative `Probabilities'
Course URL: http://videolectures.net/mlws04_lowe_mluii/
Lecturer: David Lowe
Institution: University of British Columbia
Date: 2007-02-25
Language: English
Course summary: "The only difference between a probabilistic classical world and the equations of the quantum world is that somehow or other it appears as if the probabilities would have to go negative ... that's the fundamental problem. I don't know the answer to it, but I wanted to explain that if I try my best to make the equations look as near as possible to what would be imitable by a classical probabilistic computer, I get into trouble." These are the words of Richard Feynman in a famous keynote talk on Simulating Physics with Computers. He was pointing out that we have to face an intrinsic conceptual difficulty if we want to understand the world through mimicking its behaviour with computational systems.

Actually, we do not have to go as esoteric as quantum physics. We see some of the same issues in Machine Learning and inference from probabilistic estimators in data-driven modelling. And in the same way that Feynman did not know the resolution to his problem, we are only just starting to become aware of some of our own problems in machine intelligence. The principled approach to machine intelligence that we have now come to accept is a probabilistic one. The Bayesian view of inference is subjective, and our knowledge of the universe derives from observation. But I will argue that the use of Machine Learning to represent or simulate the universe only allows generically non-positive probabilities! Of course, we can fudge away some of the more uncomfortable aspects these issues raise, but it should still make us think about whether we have the correct working framework.

In this talk I want to question parts of the working machinery we use in Machine Learning. At its heart I want to challenge the assumption that probabilities have to be positive. I will give several arguments, descriptive and formal, to indicate why the use of positive probabilities is an ideal that is both overly restrictive and unrealisable. Indeed, I will argue that the use of non-positive `probabilities' is both inevitable and natural. To do this I will need some old mathematical ideas from classical statistics and some more modern ideas from information theory. I will use simple examples and proofs from Machine Learning applied to regression and classification tasks, and draw parallels with some basic quantum theory ideas.

The core of the argument is that in modelling the universe through Machine Learning, we are obliged to make inferences based on finite, and hence typically less-than-complete, information. We can never know everything about a situation, and this gives us our link between quantum mechanics and statistical inference through machine learning. I will make the case that inference through any finite data-driven computation leads to this apparent problem with `probabilities'. So the issue is not just connected with quantum mechanics; it is a more generic problem related to trying to simulate even classical probabilities with Machine Learning ideas. If time allows, I will also discuss the consequences for information measures such as entropy, and make the case for Fisher information as a more appropriate measure of our state of knowledge about a system.
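The claim that finite data-driven inference naturally produces negative `probabilities' can be illustrated with a standard construction from classical statistics (this example is not taken from the talk itself): a truncated orthogonal-series density estimator. Each coefficient is an unbiased sample average, yet the resulting density estimate routinely dips below zero wherever the true density is small. A minimal sketch, where the bimodal sample and the truncation order K are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw a sharply bimodal sample on (0, 1).
x = np.concatenate([rng.normal(0.25, 0.05, 200),
                    rng.normal(0.75, 0.05, 200)])
x = x[(x > 0) & (x < 1)]

# Orthonormal cosine basis on [0, 1]: phi_0 = 1, phi_k = sqrt(2) cos(k pi t).
# The series coefficient c_k = E[phi_k(X)] is estimated by a sample mean.
K = 15                                  # truncation order (illustrative)
grid = np.linspace(0.0, 1.0, 501)
f_hat = np.ones_like(grid)              # c_0 = 1 for any density on [0, 1]
for k in range(1, K + 1):
    c_k = np.sqrt(2) * np.cos(k * np.pi * x).mean()        # unbiased estimate
    f_hat += c_k * np.sqrt(2) * np.cos(k * np.pi * grid)   # add k-th term

# Despite unbiased coefficients, the estimate goes negative in low-density
# regions (ringing of the truncated series plus sampling noise).
print(float(f_hat.min()))   # typically negative
```

The estimator still integrates to one (the higher basis functions integrate to zero), so the negative lobes are exactly compensated by mass elsewhere: a finite-information "quasi-probability" in the sense the abstract describes, arising in a purely classical setting.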
Keywords: probability; quantum world; computational systems
Course source: VideoLectures.NET
Last reviewed: 2019-09-05 (lxf)
Views: 57