Category Archives: Quantum Quandaries

The Choi-Jamiolkowski Isomorphism: You’re Doing It Wrong!

As the dear departed Quantum Pontiff used to say: New Paper Dance! I am pretty happy that this one has finally been posted because it is my first arXiv paper since I returned to work, and also because it has gone through more rewrites than Spiderman: The Musical.

What is the paper about, I hear you ask? Well, mathematically, it is about an extremely simple linear algebra trick called the Choi-Jamiolkowski isomorphism. This is actually two different results: the Choi isomorphism and the Jamiolkowski isomorphism, but people have a habit of lumping them together. This trick is so extremely well-known to quantum information theorists that it is not even funny. One of the main points of the paper is that you should think about what the isomorphism means physically in a new way. Hence the “you’re doing it wrong” in the post title.

First Level Isomorphisms

For the uninitiated, here is the simplest way of describing the Choi isomorphism in a single equation:
\[\Ket{j}\Bra{k} \qquad \qquad \equiv \qquad \qquad \Ket{j} \otimes \Ket{k},\]
i.e. the isomorphism works by turning a bra into a ket. The thing on the left is an operator on a Hilbert space \(\mathcal{H}\) and the thing on the right is a vector in \(\mathcal{H} \otimes \mathcal{H}\), so the isomorphism says that \(\mathcal{L}(\mathcal{H}) \equiv \mathcal{H} \otimes \mathcal{H}\), where \(\mathcal{L}(\mathcal{H})\) is the space of linear operators on \(\mathcal{H}\).

Here is how it works in general. If you have an operator \(U\) then you can pick a basis for \(\mathcal{H}\) and write \(U\) in this basis as
\[U = \sum_{j,k} U_{j,k} \Ket{j}\Bra{k},\]
where \(U_{j,k} = \Bra{j}U\Ket{k}\). Then you just extend the above construction by linearity and write down a vector
\[\Ket{\Phi_U} = \sum_{j,k} U_{j,k} \Ket{j} \otimes \Ket{k}.\]
It is pretty obvious that we can go in the other direction as well: starting with a vector in \(\mathcal{H}\otimes\mathcal{H}\), we can write it out in a product basis, turn the second ket into a bra, and we then have an operator.
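For the numerically inclined, the whole construction is a one-liner, since with the convention \(\Ket{\Phi_U} = \sum_{j,k} U_{j,k}\Ket{j}\otimes\Ket{k}\) the Choi isomorphism is literally just a reshape of the matrix of coefficients. Here is a toy numpy sketch (my own illustration, not from the paper):

```python
import numpy as np

d = 2
rng = np.random.default_rng(0)
U = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))

# Choi isomorphism: |Phi_U> = sum_{j,k} U_{jk} |j> (x) |k>.
# In row-major ordering this is just a reshape of the matrix U.
phi_U = U.reshape(d * d)

# Reverse direction: turn the second ket back into a bra,
# i.e. reshape the vector back into a matrix.
U_back = phi_U.reshape(d, d)

assert np.allclose(U_back, U)
```

The row-major ordering that numpy uses by default is exactly the ordering in which the second ket index varies fastest, so no index gymnastics are needed.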

So far, this is all pretty trivial linear algebra, but when we think about what this means physically it is pretty weird. One of the things that is represented by an operator in quantum theory is dynamics, in particular a unitary operator represents the dynamics of a closed system for a discrete time-step. One of the things that is represented by a vector on a tensor product Hilbert space is a pure state of a bipartite system. It is fairly easy to see that (up to normalization) unitary operators get mapped to maximally entangled states under the isomorphism, so, in some sense, a maximally entangled state is “the same thing” as a unitary operator. This is weird because there are some things that make sense for dynamical operators that don’t seem to make sense for states and vice-versa. For example, dynamics can be composed. If \(U\) represents the dynamics from \(t_0\) to \(t_1\) and \(V\) represents the dynamics from \(t_1\) to \(t_2\), then the dynamics from \(t_0\) to \(t_2\) is represented by the product \(VU\). Using the isomorphism, we can define a composition for states, but what on earth does this mean?
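Here is a quick numerical check (again, just a toy sketch of my own) that vectors isomorphic to unitaries are maximally entangled: the Schmidt coefficients of \(\Ket{\Phi_U}\) are the singular values of \(U\), which are all equal to 1 when \(U\) is unitary, and the reduced state on either factor is the (unnormalized) maximally mixed state:

```python
import numpy as np

d = 2
rng = np.random.default_rng(1)

# any unitary will do: the QR decomposition of a Gaussian matrix gives one
z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
U, _ = np.linalg.qr(z)

phi_U = U.reshape(d * d)  # |Phi_U> = sum_{jk} U_{jk} |j>|k>

# Schmidt coefficients of |Phi_U> are the singular values of U,
# all equal to 1 because U is unitary
schmidt = np.linalg.svd(phi_U.reshape(d, d), compute_uv=False)
assert np.allclose(schmidt, np.ones(d))

# reduced state on the first factor: Tr_B |Phi_U><Phi_U| = U U^dagger = I
reduced = np.einsum('jk,mk->jm', U, U.conj())
assert np.allclose(reduced, np.eye(d))
```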

Before getting on to that, let us briefly pause to consider the Jamiolkowski version of the isomorphism. The Choi isomorphism is basis dependent. You get a slightly different state if you write down the operator in a different basis. To make things basis independent, we replace \(\mathcal{H}\otimes\mathcal{H}\) by \(\mathcal{H}\otimes\mathcal{H}^*\). Here, \(\mathcal{H}^*\) denotes the dual space to \(\mathcal{H}\), i.e. it is the space of bras instead of the space of kets. In Dirac notation, the Jamiolkowski isomorphism looks pretty trivial. It says
\[\Ket{j}\Bra{k} \qquad \qquad \equiv \qquad \qquad \Ket{j} \otimes \Bra{k}.\]
This is axiomatic in Dirac notation, because we always assume that tensor product symbols can be omitted without changing anything. However, this version of the isomorphism is going to become important later.

Conventional Interpretation: Gate Teleportation

In quantum information, the Choi isomorphism is usually interpreted in terms of “gate teleportation”. To understand this, we first reformulate the isomorphism slightly. Let \(\Ket{\Phi^+}_{AA’} = \sum_j \Ket{jj}_{AA’}\), where \(A\) and \(A’\) are quantum systems with Hilbert spaces of the same dimension. The vectors \(\Ket{j}\) form a preferred basis, and this is the basis in which the Choi isomorphism is going to be defined. Note that \(\Ket{\Phi^+}_{AA’}\) is an (unnormalized) maximally entangled state. It is easy to check that the isomorphism can now be reformulated as
\[\Ket{\Phi_U}_{AA’} = I_A \otimes U_{A’} \Ket{\Phi^+}_{AA’},\]
where \(I_A\) is the identity operator on system \(A\). The reverse direction of the isomorphism is given by
\[U_A \Ket{\psi}\Bra{\psi}_A U_A^{\dagger} = \Bra{\Phi^+}_{A’A”} \left ( \Ket{\psi}\Bra{\psi}_{A”} \otimes \Ket{\Phi_U}\Bra{\Phi_U}_{A’A} \right )\Ket{\Phi^+}_{A’A”},\]
where \(A^{\prime\prime}\) is yet another quantum system with the same Hilbert space as \(A\).
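The reverse direction can be checked numerically. In the following sketch (labels and conventions are mine, not from the paper) I write \(\Ket{\Phi_U}\) as a matrix with one index per system, using the convention that \(U\) acts on the system that receives the teleported state; the whole formula then reduces to a single index contraction:

```python
import numpy as np

d = 2
rng = np.random.default_rng(2)

z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
U, _ = np.linalg.qr(z)  # a random unitary

psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
psi /= np.linalg.norm(psi)  # a random pure state on A''

# |Phi_U> = (I (x) U)|Phi+> as a matrix Phi_U[a', a] = U[a, a'],
# where a' is the index that the Bell projector contracts with A''
Phi_U = U.T

# <Phi+|_{A'A''} ( |psi><psi|_{A''} (x) |Phi_U><Phi_U|_{A'A} ) |Phi+>_{A'A''}
# with the unnormalized |Phi+> = sum_j |jj>, this is the contraction:
out = np.einsum('x,y,xa,yb->ab', psi, psi.conj(), Phi_U, Phi_U.conj())

expected = U @ np.outer(psi, psi.conj()) @ U.conj().T
assert np.allclose(out, expected)  # the state arrives with U applied
```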

Now let’s think about the physical interpretation of the reverse direction of the isomorphism. Suppose that \(U\) is the identity. In that case, \(\Ket{\Phi_U} = \Ket{\Phi^+}\) and the reverse direction of the isomorphism is easily recognized as the expression for the output of the teleportation protocol when the \(\Ket{\Phi^+}\) outcome is obtained in the Bell measurement. It says that \(\Ket{\psi}\) gets teleported from \(A^{\prime\prime}\) to \(A\). Of course, this outcome only occurs some of the time, with probability \(1/d\), where \(d\) is the dimension of the Hilbert space of \(A\), a fact that is obscured by our decision to use an unnormalized version of \(\Ket{\Phi^+}\).

Now, if we let \(U\) be a nontrivial unitary operator then the reverse direction of the isomorphism says something more interesting. If we use the state \(\Ket{\Phi_U}\) rather than \(\Ket{\Phi^+}\) as our resource state in the teleportation protocol, then, upon obtaining the \(\Ket{\Phi^+}\) outcome in the Bell measurement, the output of the protocol will not simply be the input state \(\Ket{\psi}\), but it will be that state with the unitary \(U\) applied to it. This is called “gate teleportation”. It has many uses in quantum computing. For example, in linear optics implementations, it is impossible to perform every gate in a universal set with 100% probability. To avoid damaging your precious computational state, you can apply the indeterministic gates to half of a maximally entangled state and keep doing so until you get one that succeeds. Then you can teleport your computational state using the resulting state as a resource and end up applying the gate that you wanted. This allows you to use indeterministic gates without having to restart the computation from the beginning every time one of these gates fails.

Using this interpretation of the isomorphism, we can also come up with a physical interpretation of the composition of two states. It is basically a generalization of entanglement swapping. If you take \(\Ket{\Phi_U}\) and \(\Ket{\Phi_{V}}\) and perform a Bell measurement across the output system of the first and the input system of the second then, upon obtaining the \(\Ket{\Phi^+}\) outcome, you will have the state \(\Ket{\Phi_{VU}}\). In this way, you can perform your entire computational circuit in advance, before you have access to the input state, and then just teleport your input state into the output register as the final step.
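Numerically, composing two isomorphic states by projecting the output of the first and the input of the second onto the unnormalized \(\Ket{\Phi^+}\) is again a single index contraction. Here is a toy check (conventions mine; note that with \(U\) applied first, the composite dynamics is \(VU\)):

```python
import numpy as np

d = 2
rng = np.random.default_rng(3)

def random_unitary(d, rng):
    z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    q, _ = np.linalg.qr(z)
    return q

U, V = random_unitary(d, rng), random_unitary(d, rng)

# |Phi_U> as a matrix indexed (input, output): Phi_U[a, a'] = U[a', a]
Phi_U, Phi_V = U.T, V.T

# project the output of the first and the input of the second onto the
# unnormalized |Phi+> = sum_n |nn>, i.e. contract those two indices
composed = np.einsum('an,nb->ab', Phi_U, Phi_V)

# the result is the state isomorphic to VU (U applied first, then V)
assert np.allclose(composed, (V @ U).T)
```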

In this way, the Choi isomorphism leads to a correspondence between a whole host of protocols involving gates and protocols involving entangled states. We can also define interesting properties of operations, such as the entanglement of an operation, in terms of the states that they correspond to. We then use the isomorphism to give a physical meaning to these properties in terms of gate teleportation. However, one weak point of the correspondence is that it transforms something deterministic, the application of a unitary operation, into something indeterministic, namely obtaining the \(\Ket{\Phi^+}\) outcome in a Bell measurement. Unlike the teleportation protocol, gate teleportation cannot be made deterministic by applying correction operations for the other outcomes, at least not if we want these corrections to be independent of \(U\). The states you get for the other outcomes involve nasty things like \(U^*, U^T, U^\dagger\) applied to \(\Ket{\psi}\), depending on exactly how you construct the Bell basis, e.g. the choice of phases. These typically cannot be corrected without applying \(U\). In particular, that would screw things up in the linear optics application, wherein \(U\) can only be implemented non-deterministically.

Before turning to our alternative interpretation of Choi-Jamiolkowski, let’s generalize things a bit.

Second Level Isomorphisms

In quantum theory we don’t just have pure states, but also mixed states that arise if you have uncertainty about which state was prepared, or if you ignore a subsystem of a larger system that is in a pure state. These are described by positive, trace-one, operators, denoted \(\rho\), called density operators. Similarly, dynamics does not have to be unitary. For example, we might bring in an extra system, interact them unitarily, and then trace out the extra system. These are described by Completely-Positive, Trace-Preserving (CPT) maps, denoted \(\mathcal{E}\). These are linear maps that act on the space of operators, i.e. they are operators on the space of operators, and are often called superoperators.

Now, the set of operators on a Hilbert space is itself a Hilbert space with inner product \(\left \langle N, M \right \rangle = \Tr{N^{\dagger}M}\). Thus, we can apply Choi-Jamiolkowski on this space to define a correspondence between superoperators and operators on the tensor product. We can do this in terms of an orthonormal operator basis with respect to the trace inner product, but it is easier to just give the teleportation version of the isomorphism. We will also generalize slightly to allow for the possibility that the input and output spaces of our CPT map may be different, i.e. it may involve discarding a subsystem of the system we started with, or bringing in extra ancillary systems.

Starting with a CPT map \(\mathcal{E}_{B|A}: \mathcal{L}(\mathcal{H}_A) \rightarrow \mathcal{L}(\mathcal{H}_B)\) from system \(A\) to system \(B\), we can define an operator on \(\mathcal{H}_A \otimes \mathcal{H}_B\) via
\[\rho_{AB} = \mathcal{E}_{B|A’} \otimes \mathcal{I}_{A} \left ( \Ket{\Phi^+}\Bra{\Phi^+}_{AA’}\right ),\]
where \(\mathcal{I}_A\) is the identity superoperator. This is a positive operator, but it is not quite a density operator as it satisfies \(\PTr{B}{\rho_{AB}} = I_A\), which implies that \(\PTr{AB}{\rho_{AB}} = d\) rather than \(\PTr{AB}{\rho_{AB}} = 1\). This is analogous to using unnormalized states in the pure-state case. The reverse direction of the isomorphism is then given by
\[\mathcal{E}_{B|A} \left ( \sigma_A \right ) = \Bra{\Phi^+}_{A’A}\sigma_{A’} \otimes \rho_{AB}\Ket{\Phi^+}_{A’A}.\]
This has the same interpretation in terms of gate teleportation (or rather CPT-map teleportation) as before.
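Here is a sketch of both directions at the CPT-map level, using an amplitude damping channel as a concrete example (the channel choice and the index conventions are mine, not from the paper). It builds the Choi operator \(\rho_{AB}\), checks the normalization \(\PTr{B}{\rho_{AB}} = I_A\), and then recovers the action of the channel from the operator:

```python
import numpy as np

d, gamma = 2, 0.3

# amplitude damping channel, given by its Kraus operators
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]])
K1 = np.array([[0, np.sqrt(gamma)], [0, 0]])
kraus = [K0, K1]

def channel(sigma):
    return sum(K @ sigma @ K.conj().T for K in kraus)

# Choi operator: rho_AB = sum_{j,j'} |j><j'|_A (x) E(|j><j'|)_B,
# stored with indices (a, b, a', b')
rho_AB = np.zeros((d, d, d, d), dtype=complex)
for j in range(d):
    for jp in range(d):
        rho_AB[j, :, jp, :] = channel(np.outer(np.eye(d)[j], np.eye(d)[jp]))

# normalization: Tr_B rho_AB = I_A (so the total trace is d, not 1)
assert np.allclose(np.einsum('abcb->ac', rho_AB), np.eye(d))

# reverse direction: E(sigma) = <Phi+| sigma_{A'} (x) rho_AB |Phi+>
sigma = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
out = np.einsum('mn,mbnc->bc', sigma, rho_AB)
assert np.allclose(out, channel(sigma))
```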

The Jamiolkowski version of this isomorphism is given by
\[\varrho_{AB} = \mathcal{E}_{B|A’} \otimes \mathcal{I}_{A} \left ( \Ket{\Phi^+}\Bra{\Phi^+}_{AA’}^{T_A}\right ),\]
where \(T_A\) denotes the partial transpose on system \(A\), taken in the basis used to define \(\Ket{\Phi^+}\). Although it is not obvious from this formula, this operator is independent of the choice of basis, as \(\Ket{\Phi^+}\Bra{\Phi^+}_{AA’}^{T_A}\) is actually the same operator for any choice of basis. I’ll keep the reverse direction of the isomorphism a secret for now, as it would give a strong hint towards the punchline of this blog post.

Probability Theory

I now want to give an alternative way of thinking about the isomorphism, in particular the Jamiolkowski version, that is in many ways conceptually clearer than the gate teleportation interpretation. The starting point is the idea that quantum theory can be viewed as a noncommutative generalization of classical probability theory. This idea goes back at least to von Neumann, and is at the root of our thinking in quantum information theory, particularly in quantum Shannon theory. The basic idea of the generalization is that probability distributions \(P(X)\) get mapped to density operators \(\rho_A\) and sums over variables become partial traces. Therefore, let’s start by thinking about whether there is a classical analog of the isomorphism, and, if so, what its interpretation is.

Suppose we have two random variables, \(X\) and \(Y\). We can define a conditional probability distribution of \(Y\) given \(X\), \(P(Y|X)\), as a positive function of the two variables that satisfies \(\sum_Y P(Y|X) = 1\) independently of the value of \(X\). Given a conditional probability distribution and a marginal distribution, \(P(X)\), for \(X\), we can define a joint distribution via
\[P(X,Y) = P(Y|X)P(X).\]
Conversely, given a joint distribution \(P(X,Y)\), we can find the marginal \(P(X) = \sum_Y P(X,Y)\) and then define a conditional distribution
\[P(Y|X) = \frac{P(X,Y)}{P(X)}.\]
Note, I’m going to ignore the ambiguities in this formula that occur when \(P(X)\) is zero for some values of \(X\).

Now, suppose that \(X\) and \(Y\) are the input and output of a classical channel. I now want to think of the probability distribution of \(Y\) as being determined by a stochastic map \(\Gamma_{Y|X}\) from the space of probability distributions over \(X\) to the space of probability distributions over \(Y\). Since \(P(Y) = \sum_{X} P(X,Y)\), this has to be given by
\[P(Y) = \Gamma_{Y|X} \left ( P(X)\right ) = \sum_X P(Y|X) P(X),\]
\[\Gamma_{Y|X} \left ( \cdot \right ) = \sum_{X} P(Y|X) \left ( \cdot \right ).\]
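In code, all of this is elementary matrix algebra: the conditional distribution is a row-normalized matrix, and the stochastic map \(\Gamma_{Y|X}\) is just multiplication by that matrix. A toy example (the numbers are mine, chosen for illustration):

```python
import numpy as np

# a joint distribution P(X, Y), stored as a matrix indexed [x, y]
P_XY = np.array([[0.1, 0.3],
                 [0.2, 0.4]])

P_X = P_XY.sum(axis=1)        # marginal P(X)
P_YgX = P_XY / P_X[:, None]   # conditional P(Y|X) = P(X,Y) / P(X)

# normalization: sum_Y P(Y|X) = 1 for every value of X
assert np.allclose(P_YgX.sum(axis=1), 1.0)

# the stochastic map: P(Y) = sum_X P(Y|X) P(X)
P_Y = P_YgX.T @ P_X
assert np.allclose(P_Y, P_XY.sum(axis=0))

# reconstructing the joint: P(X,Y) = P(Y|X) P(X)
assert np.allclose(P_YgX * P_X[:, None], P_XY)
```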

What we have here is a correspondence between a positive function of two variables — the conditional probability distribution — and a linear map that acts on the space of probability distributions — the stochastic map. This looks analogous to the Choi-Jamiolkowski isomorphism, except that, instead of a joint probability distribution, which would be analogous to a quantum state, we have a conditional probability distribution. This suggests that we made a mistake in thinking of the operator in the Choi isomorphism as a state. Maybe it is something more like a conditional state.

Conditional States

Let’s just plunge in and make a definition of a conditional state, and then see how it makes sense of the Jamiolkowski isomorphism. For two quantum systems, \(A\) and \(B\), a conditional state of \(B\) given \(A\) is defined to be a positive operator \(\rho_{B|A}\) on \(\mathcal{H}_A \otimes \mathcal{H}_B\) that satisfies
\[\PTr{B}{\rho_{B|A}} = I_A.\]
This is supposed to be analogous to the condition \(\sum_Y P(Y|X) = 1\). Notice that this is exactly how the operators that are Choi-isomorphic to CPT maps are normalized.

Given a conditional state, \(\rho_{B|A}\), and a reduced state \(\rho_A\), I can define a joint state via
\[\rho_{AB} = \sqrt{\rho_A} \rho_{B|A} \sqrt{\rho_A},\]
where I have suppressed the implicit \(\otimes I_B\) required to make the products well defined. The conjugation by the square root ensures that \(\rho_{AB}\) is positive, and it is easy to check that \(\PTr{AB}{\rho_{AB}} = 1\).

Conversely, given a joint state, I can find its reduced state \(\rho_A = \PTr{B}{\rho_{AB}}\) and then define the conditional state
\[\rho_{B|A} = \sqrt{\rho_A^{-1}} \rho_{AB} \sqrt{\rho_A^{-1}},\]
where I am going to ignore cases in which \(\rho_A\) has any zero eigenvalues so that the inverse is well-defined (this is no different from ignoring the division by zero in the classical case).
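Here is a numerical round trip between joint and conditional states (my own sketch, with a randomly generated full-rank joint state so that the inverse square root exists):

```python
import numpy as np

def psd_sqrt(M):
    # square root of a positive semidefinite matrix via its eigendecomposition
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.conj().T

d = 2
rng = np.random.default_rng(4)

# a random full-rank joint state rho_AB on two qubits
G = rng.standard_normal((d * d, d * d)) + 1j * rng.standard_normal((d * d, d * d))
rho_AB = G @ G.conj().T
rho_AB /= np.trace(rho_AB)

# reduced state rho_A = Tr_B rho_AB (tensor indices are (a, b, a', b'))
rho_A = np.einsum('abcb->ac', rho_AB.reshape(d, d, d, d))

# conditional state rho_{B|A} = rho_A^{-1/2} rho_AB rho_A^{-1/2}
inv_sqrt_A = np.kron(np.linalg.inv(psd_sqrt(rho_A)), np.eye(d))
rho_BgA = inv_sqrt_A @ rho_AB @ inv_sqrt_A

# normalization analogous to sum_Y P(Y|X) = 1: Tr_B rho_{B|A} = I_A
assert np.allclose(np.einsum('abcb->ac', rho_BgA.reshape(d, d, d, d)), np.eye(d))

# reconstructing the joint state: rho_AB = sqrt(rho_A) rho_{B|A} sqrt(rho_A)
sqrt_A = np.kron(psd_sqrt(rho_A), np.eye(d))
assert np.allclose(sqrt_A @ rho_BgA @ sqrt_A, rho_AB)
```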

Now, suppose you are given \(\rho_A\) and you want to know what \(\rho_B\) should be. Is there a linear map that tells you how to do this, analogous to the stochastic map \(\Gamma_{Y|X}\) in the classical case? The answer is obviously yes. We can define a map \(\mathfrak{E}_{B|A}: \mathcal{L} \left ( \mathcal{H}_A\right ) \rightarrow \mathcal{L} \left ( \mathcal{H}_B\right )\) via
\[\mathfrak{E}_{B|A} \left ( \rho_A \right ) = \PTr{A}{\rho_{B|A} \rho_A},\]
where we have used the cyclic property of the trace to combine the \(\sqrt{\rho_A}\) terms, or
\[\mathfrak{E}_{B|A} \left ( \cdot \right ) = \PTr{A}{\rho_{B|A} (\cdot)}.\]
The map \(\mathfrak{E}_{B|A}\) so defined is just the Jamiolkowski isomorphic map to \(\rho_{B|A}\) and the above equation gives the reverse direction of the Jamiolkowski isomorphism that I was being secretive about earlier.
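As a sanity check, here is a sketch (conventions mine) verifying that \(\PTr{A}{\rho_{B|A} \rho_A}\) recovers the reduced state \(\rho_B\) when \(\rho_{B|A}\) is the conditional state of a joint state \(\rho_{AB}\) with marginal \(\rho_A\):

```python
import numpy as np

def psd_sqrt(M):
    vals, vecs = np.linalg.eigh(M)
    return (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.conj().T

d = 2
rng = np.random.default_rng(5)

G = rng.standard_normal((d * d, d * d)) + 1j * rng.standard_normal((d * d, d * d))
rho_AB = G @ G.conj().T
rho_AB /= np.trace(rho_AB)

R = rho_AB.reshape(d, d, d, d)       # tensor indices (a, b, a', b')
rho_A = np.einsum('abcb->ac', R)     # Tr_B
rho_B = np.einsum('abad->bd', R)     # Tr_A

# conditional state rho_{B|A} = rho_A^{-1/2} rho_AB rho_A^{-1/2}
inv_sqrt_A = np.kron(np.linalg.inv(psd_sqrt(rho_A)), np.eye(d))
rho_BgA = inv_sqrt_A @ rho_AB @ inv_sqrt_A

# the Jamiolkowski-isomorphic map applied to rho_A: Tr_A[ rho_{B|A} rho_A ]
out = np.einsum('abcd,ca->bd', rho_BgA.reshape(d, d, d, d), rho_A)
assert np.allclose(out, rho_B)  # it returns the correct reduced state
```

The cyclic property of the partial trace over \(A\) is what lets the two \(\sqrt{\rho_A}\) factors combine into a single \(\rho_A\) here.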

The punchline is that the Choi-Jamiolkowski isomorphism should not be thought of as a mapping between quantum states and quantum operations, but rather as a mapping between conditional quantum states and quantum operations. It is no more surprising than the fact that classical stochastic maps are determined by conditional probability distributions. If you think of it in this way, then your approach to quantum information will become conceptually simpler in a lot of ways. These ways are discussed in detail in the paper.

Causal Conditional States

There is a subtlety that I have glossed over so far that I’d like to end with. The map \(\mathfrak{E}_{B|A}\) is not actually completely positive, which is why I did not denote it \(\mathcal{E}_{B|A}\), but when preceded by a transpose on \(A\) it defines a completely positive map. This is because the Jamiolkowski isomorphism is defined in terms of the partial transpose of the maximally entangled state. Also, so far I have been talking about two distinct quantum systems that exist at the same time, whereas in the classical case, I talked about the input and output of a classical channel. A quantum channel is given by a CPT map \(\mathcal{E}_{B|A}\) and its Jamiolkowski representation would be
\[\mathcal{E}_{B|A} \left (\rho_A \right ) = \PTr{A}{\varrho_{B|A}\rho_A},\]
where \(\varrho_{B|A}\) is the partial transpose over \(A\) of a positive operator and it satisfies \(\PTr{B}{\varrho_{B|A}} = I_A\). This is the appropriate notion of a conditional state in the causal scenario, where you are talking about the input and output of a quantum channel rather than two systems at the same time. The two types of conditional state are related by a partial transpose.

Despite this difference, a good deal of unification is achieved between the way in which acausally related (two subsystems) and causally related (input and output of channels) degrees of freedom are described in this framework. For example, we can define a “causal joint state” as
\[\varrho_{AB} = \sqrt{\rho_A} \varrho_{B|A} \sqrt{\rho_A},\]
where \(\rho_A\) is the input state to the channel and \(\varrho_{B|A}\) is the Jamiolkowski isomorphic map to the CPT map. This unification is another main theme of the paper, and allows a quantum version of Bayes’ theorem to be defined that is independent of the causal scenario.

The Wonderful World of Conditional States

To end with, here is a list of some things that become conceptually simpler in the conditional states formalism developed in the paper:

  • The Born rule, ensemble averaging, and quantum dynamics are all just instances of a quantum analog of the formula \(P(Y) = \sum_X P(Y|X)P(X)\).
  • The Heisenberg picture is just a quantum analog of \(P(Z|X) = \sum_Y P(Z|Y)P(Y|X)\).
  • The relationship between prediction and retrodiction (inferences about the past) in quantum theory is given by the quantum Bayes’ theorem.
  • The formula for the set of states that a system can be ‘steered’ to by making measurements on a remote system, as in EPR-type experiments, is just an application of the quantum Bayes’ theorem.

If this has whetted your appetite, then this and much more can be found in the paper.

Foundations Mailing Lists

Bob Coecke has recently set up an email mailing list for announcements in the foundations of quantum theory (conference announcements, job postings and the like). You can subscribe by sending a blank email to The mailing list is moderated so you will not get inundated by messages from cranks.

On a similar note, I thought I would mention the philosophy of physics mailing list, which has been going for about seven years and also often features announcements that are relevant to the foundations of quantum theory. Obviously, the focus is more on the philosophy side, but I have often heard about interesting conferences and workshops via this list.

Job/Course/Conference Announcements

Here are a few announcements that have arrived in my inbox in the past few days.

Perimeter Scholars International

Canada’s Perimeter Institute for Theoretical Physics (PI), in partnership with the University of Waterloo, welcomes applications to the Master’s level course, Perimeter Scholars International (PSI). Exceptional students with an undergraduate honours degree in Physics, Math, Engineering or Computer Science are encouraged to apply. Students must have a minimum of 3 upper level undergraduate or graduate courses in physics. PSI recruits a diverse group of students and especially encourages applications from qualified women candidates. The due date for applications to PSI is February 1st, 2011. Complete details are available at

Foundations Postdocs

Also a reminder that it is currently postdoc hiring season at Perimeter Institute. Although the deadline for applications has passed, they will always consider applications from qualified candidates if not all positions have been filled. Anyone looking for a postdoc in quantum foundations should definitely apply. In fact, if you are looking for a foundations job and you have not applied to PI then you must be quite mad, since there are not a lot of foundations positions in physics to be had elsewhere. Details are here.

Quantum Interactions

I will admit that this next conference announcement is a little left field, but some of the areas it covers are very interesting and worthwhile in my opinion, particularly the biological and artificial intelligence applications.

The Fifth International Symposium on Quantum Interaction (QI’2011), 27-29 June 2011, Aberdeen, United Kingdom.

Quantum Interaction (QI) is an emerging field which is applying quantum theory (QT) to domains such as artificial intelligence, human language, cognition, information retrieval, biology, political science, economics, organisations and social interaction.

After highly successful previous meetings (QI’2007 at Stanford, QI’2008 at Oxford, QI’2009 at Saarbruecken, QI’2010 at Washington DC), the Fifth International Quantum Interaction Symposium will take place in Aberdeen, UK from 27 to 29 June 2011.

This symposium will bring together researchers interested in how QT addresses problems in non-quantum domains. QI’2011 will also include a half-day tutorial session on 26 June 2011, with a number of leading researchers delivering tutorials on the foundations of QT, the application of QT to human cognition and decision making, and QT-inspired semantic information processing.

***Call for Papers***

We are seeking submission of high-quality and original research papers that have not been previously published and are not under review for another conference or journal. Papers should address one or more of the following broad content areas (the list is not exhaustive):

– Artificial Intelligence (Logic, planning, agents and multi-agent systems)

– Biological or Complex Systems

– Cognition and Brain (memory, cognitive processes, neural networks, consciousness)

– Decision Theory (political, psychological, cultural, organisational, social sciences)

– Finance and Economics (decision-making, mergers, corporate cultures)

– Information Processing and Retrieval

– Language and Linguistics

The post-conference proceedings of QI’2011 will be published by Springer in its Lecture Notes in Computer Science (LNCS) series. Authors will be required to submit a final version 14 days after the conference to reflect the comments made at the conference. We will also consider organizing a special issue for a suitable journal to publish selected best papers.

***Important Dates***

28th March 2011: Abstract submission deadline

1st April 2011: Paper submission deadline

1st May 2011: Notification of acceptance

1st June 2011: Camera-Ready Copy

26th June 2011: Tutorial Session

27th – 29th June 2011: Conference


Authors are invited to submit research papers up to 12 pages. All submissions should be prepared in English using the LNCS template, which can be downloaded from

Please submit online at:


Steering Committee:

Peter Bruza (Queensland University of Technology, Australia)

William Lawless (Paine College, USA)

Keith van Rijsbergen (University of Glasgow, UK)

Donald Sofge (Naval Research Laboratory, USA)

Dominic Widdows (Google, USA)

General Chair:

Dawei Song (Robert Gordon University, UK)

Programme Committee Chair:

Massimo Melucci (University of Padua, Italy)

Publicity Chair:

Sachi Arafat (University of Glasgow, UK)

Proceedings Chair:

Ingo Frommholz (University of Glasgow, UK)

Local Organization co-Chairs:

Jun Wang and Peng Zhang (Robert Gordon University, UK)

Quantum Foundations Meetings

Prompted in part by the Quantum Pontiff’s post about the APS March meeting, I thought it would be a good idea to post one of my extremely irregular lists of interesting conferences about the foundations of quantum theory that are coming up. A lot of my usual sources for this sort of information have become defunct in the couple of years I was away from work, so if anyone knows of any other interesting meetings then please post them in the comments.

  • March 21st-25th 2011: APS March Meeting (Dallas, Texas) – Includes a special session on Quantum Information For Quantum Foundations. Abstract submission deadline Nov. 19th.
  • April 29th-May 1st 2011: New Directions in the Foundations of Physics (Washington DC) – Always one of the highlights of the foundations calendar, but invite only.
  • May 2nd-6th 2011: 5th Feynman Festival (Brazil) – Includes foundations of quantum theory as one of its topics, but likely there will be more quantum information/computation talks. Registration deadline Feb. 1st, Abstract submission deadline Feb. 15th.
  • July 25th-30th 2011: Frontiers of Quantum and Mesoscopic Thermodynamics (Prague, Czech Republic) – Not strictly a quantum foundations conference, but there are a few foundations speakers and foundations of thermodynamics is interesting to many quantum foundations people.

A Reading List on the Foundations of Probability and Statistics

The continuing saga of time-travel in the quantum universe has been delayed because I have been working hard to finish writing a paper. Rest assured, it is coming in the next week or two. For now, I have been getting more interested in the foundations of probability and statistics. More accurately, I have always been interested in (and opinionated about) the subject, but I have recently become interested in reading more widely around it, in the hope that I will actually come to know what I am talking about. The literature on this subject is vast, so I have decided to concentrate on the arguments for different conceptions of probability and how they are used to justify statistical methodology. I have also decided to concentrate on books and collections rather than listing references to original papers, except in a few instances where I could not find a collection containing an important paper. The references are generally to the most recent edition of the texts rather than the originals. I have added comments to the references that I know something about, and will add more as I read through them. If anyone thinks I have missed something vital then please mention it in the comments.

Disclosure: All links to Amazon are affiliate links.

General Introductions

T. L. Fine, Theories of Probability (Academic Press 1973)
READ This is an excellent book, but it is not for the faint of heart. Fine does not view any of the major approaches to probability as adequate, so some parts of the book are a bit ranty, but personally I do love a good rant. It covers most of the major approaches to probability theory in full gory mathematical detail. This includes axiomatic, relative frequency, algorithmic complexity, classical, logical and subjective approaches. Fairly unique to this text is a comprehensive treatment of comparative probability, where you just have a relation of “more probable than” rather than a quantitative measure of probability. This occurs right at the beginning of the book, and may put some readers off as it is extremely technical and unfamiliar. However, once Fine gets to the more familiar territory of quantitative probability, the book becomes a lot more readable. If you are interested in the mathematical foundations of probability then you will not find a better book. A final warning is that some sections of the book are a bit out of date, since it was written in the 1970s and there has been a lot of progress in some areas since then, e.g. in maximum entropy methods and algorithmic complexity. Nevertheless, nobody has done such a comprehensive job of covering the mathematics since this book was published.
Maria Galavotti, A Philosophical Introduction to Probability (Stanford: Center for the Study of Language and Information Publications 2005)
READ A better title for this book would be “A Historico-Philosophical Introduction to Probability”. Galavotti covers all the standard interpretations of probability: classical, frequentist, propensity, logical, subjective; but she does so by focussing on the people that developed these views. Each chapter consists of sections devoted to individual researchers in the foundations of probability, beginning with a potted biography followed by a description of their view. This contrasts with other introductory texts, which tend to focus on a specific version of each viewpoint, e.g. von Mises’ theory of frequentism is usually discussed in detail, with only passing mentions of other proponents like Venn and Reichenbach. This historical approach is useful as an entry point to the historical literature and has the advantage that it covers a wider variety of opinions than other introductory texts. There are several people who are frequently mentioned in the modern literature, but usually without a detailed description of their viewpoint. From this point of view, I found the accounts of Reichenbach, Jeffreys and Ramsey very useful. Reichenbach was a frequentist, but he took a Bayesian approach to statistical inference. Given the close association between frequentism and classical statistics on the one hand, and subjectivism and Bayesian statistics on the other, it is easy to overlook the possibility of Reichenbach’s position and to view criticisms of classical statistics as criticisms of frequentism in general. The treatment of Ramsey is especially good, as this is an area in which Galavotti has done significant scholarship. Ramsey is one of the originators of the subjective view of probability, but he is usually viewed as a pluralist about probability because he made comments to the effect that a different account of probability is required for science in his published essay. Unfortunately, Ramsey died before he completed his account of probability in the sciences.
Using unpublished notebooks as sources, Galavotti argues that Ramsey was not a pluralist, and that his account of scientific probability would have been in terms of the stability of subjective probabilities. This is not completely conclusive, but represents an interesting alternative to the usual account of Ramsey’s viewpoint.

However, there are three negatives about Galavotti’s approach in this book. Firstly, given the number of viewpoints she discusses, many of the discussions are too brief to give a real understanding of the subtleties involved. Secondly, her treatment of the basic features of probability theory at the beginning of the book is rather awkward, and would be confusing for someone who had never encountered probability before (part of the awkwardness may have to do with the fact that this is a translation of the Italian original). Thirdly, mathematics is eschewed in this book, even where it would be extremely helpful. In some cases, the main objections to a viewpoint are that the mathematics does not say what the proponents would like it to say, and it is not possible to do justice to these arguments without writing down an equation or two. Therefore, even though this book has “Introduction” in the title, I cannot recommend it as a first textbook on the subject. It would be better to read something like Hacking or Gillies first, and then to use this as supplementary reading to get some historical context. Overall, this is a distinctive and original work, and provides a useful complement to more conventional textbooks on the subject.

Donald Gillies, Philosophical Theories of Probability (Routledge 2000)
READ This is the best introductory textbook on the foundations of probability from a philosophical point of view that I have read. The first part of the book treats most of the best known theories of probability: classical, logical, frequentist, subjective Bayesian and propensities. The only mainstream interpretation that is missing is Lewis’ conception of objective chances and the principal principle. This is a shame, as it is currently one of the most fashionable, particularly amongst the philosophers of quantum theory that I associate with. Gillies does a good job of explaining the distinction between objective and subjective approaches to probability, and his discussion of the merits and criticisms of each view is largely balanced and measured. Where mathematical technicalities come up, such as infinite sample spaces and limit theorems, the details are omitted, as is appropriate in an introductory book, but what he says about them is conceptually accurate. The second part of the book lays out the author’s own views on probability, which include a defence of a pluralist approach in which different interpretations of probability are appropriate for different subject areas. He comes down in favour of a propensity-long run frequency view of objective chances and a subjectivist view of other probabilities, with a spectrum of other possibilities in between. There is much that I disagree with in this part of the book, but this is not a big criticism because almost all philosophical textbooks become controversial when the author discusses their own views. For completeness, here are the main points that I disagree with:

  1. I think the argument that exchangeability cannot justify statistical methodology is based on a double standard about the degree to which interpretations of probability are allowed to be approximate. Frequentist theories are given far more leniency in how closely they have to approximate reality.
  2. I do not think that the “intersubjective” interpretation of probability is distinct from the usual subjective one. The distinction is based on a misunderstanding about what an “agent” is in the subjective theory. It is not necessarily an individual human being, but could be a well-programmed computer or a community that has approximately shared values. Thus, the intersubjective theory is just a special case of the usual subjective one.
  3. I do not agree with the pluralist view of probability. For example, the argument that probabilities in economics are fundamentally different from those in natural sciences is based on our ability to conduct repeatable experiments. This is a feature of our epistemic situation and not a feature of reality. For example, we could imagine a race of aliens that are able to create multiple copies of the planet Earth that are identical in all factors that are relevant for economics. They could then perform experiments about economics that have the same status as the experiments that we perform in physics. I also think that Gillies’ distinction fails to take account of the way probabilities are used in modern subjects such as quantum information theory, where subjective probabilities surely infect our description of natural physical systems.

Despite these criticisms, which would take a whole article to explain fully, this is still an extremely good introductory text.

Ian Hacking, An Introduction to Probability and Inductive Logic (CUP 2001)
READ A general introduction geared towards philosophy students. Would be suitable for undergrads with no prior exposure to probability, though perhaps a little logic and/or naive set theory would help. A bit simplistic for anyone with a stronger background than that, but the later chapters may be useful to those unfamiliar with the different philosophical approaches to probability theory.
Alan Hájek, Interpretations of Probability, The Stanford Encyclopedia of Philosophy (Spring 2010 Edition), Edward N. Zalta (ed.),
READ As usual for the Stanford Encyclopedia of Philosophy, this is a good summary and starting point for references.
D. H. Mellor, Probability: A Philosophical Introduction (Routledge 2005)
READ I have to admit that this textbook on the philosophy of probability left me feeling more confused than I was when I started reading it. Perhaps this is because it is definitely a philosophy textbook and Mellor does not shirk the philosophical jargon from time to time, e.g. “Humean theory of causality”. One thing that I liked about the approach taken in this book is that Mellor introduces three kinds of probability — objective chances, epistemic probability and credences — right at the beginning and then goes on to discuss how each interpretation of probability reads them. This is in contrast to most other treatments, which make a broad distinction between objective and subjective interpretations and then discuss each interpretation on its own without any common thread. Mellor’s approach is better because each of the interpretations of probability has its own scope, e.g. von Mises denies the relevance of anything but chances and subjectivists deny everything except credences, so this makes it clear when the different interpretations are discussing the same thing, and when they are attempting to reduce one concept to another. The only problem with this approach is that it suggests that there definitely are three types of probability, and hence it effectively presupposes that a pluralist approach is going to be needed. I would prefer to say that there appear to be three types of probability statement in the language we use to discuss the theory, and that an interpretation of probability has to supply a meaning for each kind of statement, without assuming at the outset that the different types of statement actually correspond to distinct concepts. Unlike Gillies, this book does include extensive discussion of the principal principle and its relatives, which is a good thing. However, I found Mellor’s discussion of things like limits and infinite sample spaces to be much more misleading than the discussion in Gillies.
For example, when he introduces the concept of a limit, he uses an example of a function that tends to its limit uniformly and from one side. This is unlike probabilistic limits, which can be subject to large fluctuations. He also suggests at one point that the only probabilities that make sense on an infinite sample space are zero and one, before going on to correct himself. Whilst he eventually does get the concepts right, these sorts of statements are apt to confuse. Now, in an introductory philosophical text, I do not expect every mathematical concept to be treated in full rigour, but Gillies shows that it is possible to discuss these concepts at a heuristic level without saying anything inaccurate. Finally, Mellor has a tendency to assume that the mathematics does say what the advocates of each interpretation want it to say and then to criticize at a more conceptual level, whereas I think that some of the most effective arguments against interpretations of probability are just that the mathematics does not say what they need it to say. For all these reasons, I would recommend this as a supplementary text, particularly for philosophers, but not as a first introductory text on the subject.

General Collections

These are collections of papers that are not specific to a single approach. Collections on specific approaches are listed in the appropriate sections.

Antony Eagle (ed.), Philosophy of Probability: Contemporary Readings (Routledge 2010). Due to be published Nov. 19th.
UNREAD Contains many of the classic papers including de Finetti, Popper and Lewis, as well as modern commentary.


Ian Hacking, The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference, 2nd edition (CUP 2006)
READ In this book, Hacking takes a look at the emergence of the concept of probability during the enlightenment. The history goes up to Bernoulli’s proof of the first limit theorem. Hacking’s goal in looking at the history is to defend a philosophical thesis. Modern debates on the foundations of probability and statistics are focussed on whether probability should be regarded as an objective physical concept (what Hacking calls aleatory probability), usually framed in terms of frequencies or propensities, or as an epistemic concept concerning our knowledge and belief. Hacking argues that it was essential to the development of the probability concept that both of these ideas arose in tandem. The history is fascinating and Hacking’s argument provides a lot of context for modern debates.
Ian Hacking, The Taming of Chance (CUP 1990).
UNREAD My impression is that it covers a later period of history than the previous text.
David Salsburg, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century (Holt McDougal 2002)
READ This is a popular-level book about mathematical statistics. It is a very difficult subject to write a popular book about. Most popular science books are about the weird and wonderful things that we have discovered about reality, but this book is about the evolution of the methods that we use to justify such discoveries, and hence it is one level more abstract. It is really difficult to convey the content of the different approaches without using much mathematics. Salsburg’s approach is historical and he largely tackles developments in chronological order. The first half of the book covers Pearson, Fisher and Neyman-Pearson(II) and the invention of mathematical statistics in the twentieth century. He paints a vivid picture of the personalities involved and conveys how their ideas revolutionized the scientific method. I learnt a lot from this part of the book. In particular, I did not realize the extent to which the attempt to prove Darwinian evolution was a driving force in the development of mathematical statistics and I also did not realize how important UCL was in the early development of the subject. The second part of the book covers more modern developments and is less successful. The reason for this is that research in statistics has expanded into a vast number of different directions and is still a work in progress. Therefore, it is not clear what future generations will think the most important developments are. Salsburg deals with this by basing most of the remaining chapters on the life and theory of individual statisticians. However, these sketches are too brief to leave much of a lasting impression on the reader. There is also a brief discussion of classical vs. Bayesian statistics. Whilst the Bayesian methodology is given some credit, Salsburg is pretty firmly in the classical camp and the dismissal of Bayesianism essentially boils down to the fact that he thinks it is too subjective. 
I would have liked to have seen a more balanced discussion of this. Overall, this is an interesting book and I recommend it to anyone interested in the foundations of probability and statistics as supplementary reading. Salsburg is to be admired for attempting a popularization of such an important topic, but I have my doubts about how much readers with no prior background in statistics will get out of the book.
Stephen M. Stigler, The History of Statistics: The Measurement of Uncertainty before 1900 (Harvard University Press 1986)
UNREAD A very well regarded treatment of the early history of statistics.
Jan von Plato, Creating Modern Probability: Its Mathematics, Physics and Philosophy in Historical Perspective (CUP 1994)
UNREAD Said to be selective in its treatment of the history. Attempts to unify von Mises with de Finetti at the end of the book.

@scidata pointed me to this correspondence (pdf) between Fermat and Pascal, which provides a record of early ideas about probability.

Classical (Laplacian) Approach To Probability

Pierre Simon Marquis de Laplace, A Philosophical Essay on Probabilities (Dover 1996)
UNREAD One of the earliest attempts to outline the theory of probability. Origin of the principle of indifference.


Richard von Mises, Probability, Statistics and Truth (Dover 1981)
UNREAD The canonical work on the ensemble-based, frequentist approach to probability.

Subjective/Personalist Bayesianism

As you may be able to detect from the structure of the list, this is my current favoured approach, and I have a particular fondness for the works of de Finetti and Jeffrey. This may change as I read further into the subject.

José M. Bernardo and Adrian F. M. Smith, Bayesian Theory (Wiley 2000)
READ (well, at least the first few chapters). The modern technical “bible” of subjective Bayesianism. Contains a very intricate decision theoretic derivation of probability theory that is much more complicated than Savage as well as virtually every theorem that crops up in subjective foundations.
Bruno de Finetti, Theory of Probability: A Critical Introductory Treatment, 2 volumes (Wiley 1990)
READ. Despite the title, this is not really suitable for those without a background in probability theory or the foundational debate. Contains the loss-function approach where one takes previsions (the subjective correlate of expectation values) as fundamental rather than probabilities. Also contains extensive discussion of the finite vs. countable additivity debate and the de Finetti representation theorem.
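For reference, the representation theorem mentioned above can be stated compactly in the simplest (binary, infinitely exchangeable) case: the joint distribution of any \(n\) of the variables takes the form of a mixture of i.i.d. coin flips,
\[P(x_1, \ldots, x_n) = \int_0^1 \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i} \, d\mu(\theta),\]
for a unique measure \(\mu\) on \([0,1]\). (This is the standard textbook statement rather than de Finetti’s own prevision-based notation.)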
Bruno de Finetti, Probabilism (1989), Erkenntnis, 31:169-223.
PARTLY READ. English translation of Probabilismo, de Finetti’s first work on subjective probability, originally published in 1931. Needs to be read in conjunction with Richard Jeffrey, Reading Probabilismo (1989), Erkenntnis, 31:225-237.
Bruno de Finetti, Philosophical Lectures on Probability, edited by Alberto Mura (Springer 2008)
READ Based on transcripts of a graduate course given by de Finetti in 1979. This is for die-hard de Finetti fans only. He was obviously pretty senior when he gave this course and there is a lot of repetition. It is useful if you are a scholar who wants to pin down precisely what the later de Finetti’s ideas on fundamental topics were. Everyone else should read de Finetti’s textbook instead.
Richard Jeffrey, Subjective Probability: The Real Thing (CUP 2004). Free pdf version
READ A very readable introduction to the basics of the subjective approach. Also discusses Jeffrey conditioning (a generalization of Bayes’ rule) and applications to confirmation theory.
Richard Jeffrey, The Logic of Decision 2nd edition (University of Chicago Press 1990)
READ A philosopher’s account of the decision theoretic foundations of subjective probability. Jeffrey’s approach to decision theoretic foundations differs from the more commonly used approach of Savage in that he ascribes both probabilities and utilities to propositions, whereas Savage assigns probabilities to “states of the world” and utilities to “acts”. In general, Jeffrey also allows utilities to change as the state of belief changes, which helps to solve problems with the Bayesian treatment of things like the prisoner’s dilemma and Newcomb’s paradox. The representation theorems in Jeffrey’s approach are not as strong as in Savage’s, i.e. the probability function is not quite unique unless utilities are unbounded. Nevertheless, this is an interesting and arguably more realistic approach to decision theory as it should be applied in the real world. Finally, this book contains a comprehensive treatment of Jeffrey conditioning, which is a generalization of Bayesian conditioning to the case where an observation does not make any event in the sample space certain.
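Since Jeffrey conditioning features in both of the Jeffrey entries above, here is the rule itself in a minimal plain-Python sketch (the numbers are hypothetical, chosen purely for illustration): an observation shifts your probabilities over a partition \(\{E_i\}\) to new values \(q_i\) without making any cell certain, and beliefs in other events follow along.

```python
def jeffrey_update(p_A_given_E, q):
    """Jeffrey conditioning: P_new(A) = sum_i P_old(A | E_i) * q_i,
    where q_i = P_new(E_i) are the shifted partition probabilities."""
    assert abs(sum(q) - 1.0) < 1e-9, "q must be a probability distribution"
    return sum(p * qi for p, qi in zip(p_A_given_E, q))

# Two-cell partition {E, not-E}: a glimpse by candlelight shifts
# P(E) to 0.7 without making E certain.
print(jeffrey_update([0.9, 0.2], [0.7, 0.3]))  # ≈ 0.69
```

When some \(q_i = 1\), the rule reduces to ordinary Bayesian conditioning on \(E_i\), which is the sense in which it generalizes Bayes’ rule.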
H. E. Kyburg and Howard E. Smokler (eds.), Studies in Subjective Probability (Wiley 1964)
READ This collection is mainly of historical interest in my view. The most relevant paper for contemporary Bayesianism is de Finetti’s, which is available from a variety of other sources. The collection starts with an excerpt from Venn’s book, which sets the stage by outlining common objections to subjective approaches to probability (Venn was one of the first to present a detailed relative frequency theory). The other paper that I found interesting is Ramsey’s, since this was the first paper to present the modern subjective approach to probability based on Dutch book and decision theoretic arguments.
Leonard J. Savage, The Foundations of Statistics (Dover 1972)
READ The canonical work on decision theoretic foundations of the subjective approach.

Logical Probabilities

Rudolf Carnap, Logical Foundations of Probability (University of Chicago Press 1950)
UNREAD Supposedly one of the best worked out treatments of logical probability.
John Maynard Keynes, A Treatise On Probability (Macmillan 1921) – free ebook available from Project Gutenberg
UNREAD Supposedly more readable than Carnap. WARNING – Because this book is out of copyright there are numerous editions available from online bookstores that are of dubious quality. This is why I am not linking to any of the dozens of versions on Amazon. The best advice is to use the Gutenberg ebook or to look for an edition from a reputable publisher in a bricks and mortar bookshop. (Irrelevant fact: according to my mother, my maternal grandmother worked as a maid for Keynes.)

Objective Bayesianism and MaxEnt

Arguably, objective Bayesianism is the same thing as logical probabilities, but since I rarely hear people mention Jaynes and Cox in the same breath as Carnap and Keynes, I have decided to give the former their own section. Jaynes, in particular, is far more focussed on methodology and applications than the earlier authors.

Richard T. Cox, The Algebra of Probable Inference (The Johns Hopkins Press 1961)
UNREAD Contains the Cox axioms that characterize probability theory as an extension of logic.
Solomon Kullback, Information Theory and Statistics (Dover 1968)
UNREAD Origin of the minimization of relative entropy as an update rule in statistics. Closely related to MaxEnt.
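For reference, the quantity being minimized is easy to write down; a minimal plain-Python sketch (my own illustration, not code from the book):

```python
import math

def rel_entropy(p, q):
    """Kullback-Leibler divergence D(p||q) = sum_i p_i * log(p_i / q_i).
    The convention 0 * log(0/q) = 0 is handled by skipping zero terms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

print(rel_entropy([0.5, 0.5], [0.5, 0.5]))  # 0.0: a distribution never diverges from itself
print(rel_entropy([0.9, 0.1], [0.5, 0.5]))  # positive, by Gibbs' inequality
```

The update rule takes the prior as \(q\) and selects the \(p\) satisfying the new constraints that minimizes \(D(p\|q)\); when \(q\) is uniform this reduces to maximizing the entropy of \(p\), hence the close relation to MaxEnt.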
Edwin T. Jaynes, Probability Theory: The Logic of Science
(CUP 2003)
UNREAD The doyen of MaxEnt, in his own words.

Propensities and Objective Chances

Charles Sanders Peirce, Philosophical Writings of Peirce
, edited by Justus Buchler (Dover 1955)
UNREAD Peirce foreshadows the propensity concept of Popper in papers 11-14.
Karl R. Popper, The Propensity Interpretation of the Calculus of Probability and the Quantum Theory (1957) in S. Körner (ed.), The Colston Papers, 9: 65–70 and The Propensity Interpretation of Probability (1959) British Journal for the Philosophy of Science, 10: 25–42
UNREAD Introduces the idea of propensities. Interestingly, for Popper, quantum mechanics provides a strong motivation for the need for single-case probabilities.
Karl R. Popper, The Logic of Scientific Discovery
(Routledge Classics 2002)
UNREAD Chapter 8 explains his views on probability in comparison with other approaches.

David Lewis, Philosophical Papers: Volume II
(OUP 1986). Also available online if you have a subscription to Oxford Scholarship Online.
UNREAD Contains a reprint of the 1980 paper that introduced the principal principle, as well as a paper on conditional probabilities.

Application to the Problem of Induction and Philosophy of Science

Most of the philosophy-based introductory texts cover this subject, but these texts are specifically focussed on understanding the scientific method.

John Earman, Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory
(MIT Press 1992)
READ This is a landmark book in the philosophy of Bayesian inference as an approach to the confirmation of scientific theories. Earman is a Bayesian (at least sometimes) and he gives an honest appraisal of the successes and failures of Bayesianism in this context. The book starts with an analysis of Bayes’ original paper, which I did not personally find very interesting because I am more interested in contemporary theory than historical analysis. The second chapter gives a good overview of Bayesian methodology. The remainder of the book discusses the successes and failures of the Bayesian approach. I found the discussion of “convergence to the truth” results, done in terms of martingale theory, particularly insightful, since I have only previously seen this discussed in terms of exchangeability, and the assumptions in the martingale approach seem more reasonable (at least for non-statistical hypotheses, which are all that Earman discusses). Earman argues that the “problem of old evidence” is subsumed into the more general problem of how to incorporate new theories into a Bayesian analysis. It is the latter that is the real obstacle to obtaining scientific objectivity in the Bayesian approach. Also interesting is the discussion of the role of a more sophisticated version of Sherlock Holmes style eliminative induction in science, illustrated with a case study of experimental tests of General Relativity. At the end of the book, the Bayesian approach is compared to formal learning theory, with neither really winning the battle. Subject to a mathematical conjecture, Earman shows that formal learning theory does not really have an edge over Bayesianism. Learning theory has developed significantly since the publication of this book, so it would be interesting to see where this debate stands today. The conclusion of the book is rather pessimistic.
Bayesianism seems to provide a better account of scientific inference than its rivals, but it does not really license scientific objectivity.
Colin Howson and Peter Urbach, Scientific Reasoning: The Bayesian Approach
, 2nd edition (Open Court 1993)
READ This is a great book that argues for a Bayesian approach to scientific methodology. Most of the other major approaches to probability are well-criticized and it reads well as an introduction to the whole area. I would have liked to see more mathematical detail in some sections, but this is a book for philosophy students and it does have good pointers to the literature where you can follow up on details. Particularly insightful are the chapters that criticize classical statistical methodology, e.g. estimators, confidence intervals, least-squares regression, etc. This goes far beyond the usual myopic focus on idealized coin-flips and covers many topics that are relevant to the design of real scientific experiments, e.g. randomization, sampling, etc. My only complaint is that they kind of wuss out at the end of the book by arguing for a von Mises-style relative-frequency interpretation of objective chances, connecting it to Bayesian probabilities by an asymptotic Dutch book argument that I found unconvincing because it does not refer to a bet whose outcome can be decided (similar remarks apply to the argument for countable additivity). Despite this reservation, this book is valuable ammunition for researchers who want to be Bayesian about everything.
Brian Skyrms, Choice and Chance: An Introduction to Inductive Logic
, 4th edition (Wadsworth 1999)
READ This is not a text about probability per se, but about how to go about formulating a calculus of inductive inference, in close parallel to the calculus of deductive logic. The usual problems of induction are extensively discussed, so this would be a great companion to a first course in the philosophy of science. Probability is introduced towards the end of the book and, of course, the whole approach taken by this book biases the discussion towards logical (Keynes/Carnap) approaches to probability. Ultimately, I think that the problems addressed by this book are best treated by a subjective Bayesian approach, and that the construction of an objective calculus of induction is doomed to failure. However, a lot is learnt from the attempt, so I would heartily recommend this book to new students of the foundations of scientific methodology.


Krzysztof Burdzy, The Search for Certainty: On the Clash of Science and Philosophy of Probability
(World Scientific 2009)
UNREAD I do love a good rant and Burdzy certainly seems to have a lot of them stored up when it comes to the foundations of probability. He argues that neither the frequentist nor the subjectivist foundation can account for the practice of actual probabilists and statisticians. He also offers his own account, but the criticism seems to be the main point.

Mathematical Foundations

Eventually, a bit of rigorous measure-theoretic probability is needed, so…

A. N. Kolmogorov, Foundations of the theory of probability
, Second English Edition (Chelsea 1956)
UNREAD Classic text from the originator of measure-theoretic probability.
David Williams, Probability with Martingales
(CUP 1991)
UNREAD A lively modern treatment of rigorous probability theory that has been recommended to me many times.

The Church Of The Smaller Hilbert Space Needs Your Vote!

Chad Orzel has posted a poll about the “religious” beliefs of scientists. In case you need a reminder, this is the Church of the Larger Hilbert Space and this is the Church of the Smaller Hilbert Space (see my comment on Chad’s post for further discussion).

Use your vote wisely.

Time Travel and Information Processing

Lately, the quant-ph section of the arXiv has seen a flurry of papers investigating what would happen to quantum information processing if time travel were possible (see the more recent papers here). I am not sure exactly why this topic has become fashionable, but it may well be an example of the Bennett effect in quantum information research. That is, a research topic can meander along slowly at its own pace for a few years until Charlie Bennett publishes an (often important) paper[1] on the subject and then everyone is suddenly talking and writing about it for a couple of years. In any case, there have been a number of counter-intuitive claims that time travel enables quantum information processing to be souped up. Specifically, it supposedly enables super-hard computational problems that are in complexity classes larger than NP to be solved efficiently[2][3][4][5] and it supposedly allows nonorthogonal quantum states to be perfectly distinguished[2][6]. These claims are based on two different models of quantum time-travel, one due to David Deutsch[7] and one, due to a multitude of independent authors, based on post-selected teleportation (this paper[8] does a good job of the history in its introduction).

In this post, I am going to give a basic introduction to the physics of time-travel. In later posts, I will explain the Deutsch and teleportation-based models and evaluate the information processing claims that have been made about them. What is most interesting to me about this whole topic is that the correct model for time travelling quantum systems, and hence their information processing power, seems to depend sensitively on both the formalism and the interpretation of quantum theory that is adopted[9]. For this reason, it is a useful test-bed for ideas in quantum foundations.

Basic Concepts of Time-Travel

Everyone is familiar with the sensation of time-travel into the future. We all do it at a rate of one second per second every day of our lives. If you would like to speed up your rate of future time travel, relative to Earth, then all you have to do is take a space trip at a speed close to the speed of light. When you get back, a lot more time will have elapsed on Earth than you will have experienced on your journey. This is the time-dilation effect of special relativity. Therefore, the problem of time-travel into the future is completely solved in theory, although in practice you would need a vast source of energy in order to accelerate yourself fast enough to make the effect significant. It also causes no conceptual problems for physics, since we have a perfectly good framework for quantum theories that are compatible with special relativity, known as quantum field theory.
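The size of the effect is governed by the Lorentz factor \(\gamma = 1/\sqrt{1 - v^2/c^2}\). A quick sketch (the 0.99c cruising speed is just an arbitrary illustrative choice):

```python
import math

def gamma(v_over_c):
    """Lorentz factor for a speed given as a fraction of the speed of light."""
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)

# One year of proper time on a 0.99c cruise corresponds to roughly
# gamma years elapsed back on Earth.
print(gamma(0.99))  # ≈ 7.09
```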

On the other hand, time travel into the past is a much more tricky and conceptually interesting proposition. For one thing, it seems to entail time-travel paradoxes, such as the grandfather paradox, where you go back in time and kill your grandfather before your parents were born, so that you are never born, so that you cannot go back in time and kill your grandfather, so that you are born, so that you can go back in time and kill your grandfather, etc. (see this article for a philosophical and physics-based discussion of time travel paradoxes). For this reason, many physicists are highly sceptical of the idea that time travel into the past is possible. However, General Relativity (GR) provides a reason to temper our scepticism.

Closed Timelike Curves in GR

It has been well-known for a long time that GR admits solutions that include closed timelike curves (CTCs), i.e. world-lines that return to their starting point and loop around. If you happened to be travelling along a CTC then you would eventually end up in the past of where you started from. Actually, it is a bit more complicated than that because the usual notions of past and future do not really make sense on a CTC. However, imagine what it would look like to an observer in a part of the universe that respects causality in the usual sense. First of all, she would see you appear out of nowhere, claiming to have knowledge of events that she regards as being in the future. Some time later she would see you disappear out of existence. From her perspective it certainly looks like time-travel into the past. What things would feel like from your point of view is more of a mystery, as the notion of a CTC makes a mockery of our usual notion of “now”, i.e. it is a fundamentally block-universe construct.

The possibility of CTCs in GR was first noticed by Willem van Stockum in 1937[10] and later by Kurt Gödel in 1949[11]. Perhaps the most important solution that incorporates CTCs is the Kerr vacuum, which is the solution that describes an uncharged rotating black hole. Since most black holes in the universe are likely to be rotating, there is a sense in which one can say that CTCs are generic. The caveat is that the CTCs in the Kerr vacuum only occur in the interior of the black hole so that the physics outside the event horizon respects causality in the usual sense. Many physicists believe that the CTCs in the Kerr vacuum are mathematical artifacts, which will perhaps not occur in a full theory of quantum gravity. Nevertheless, the conceptual possibility of CTCs in General Relativity is a good reason to look at their physics more closely.

There have been a few attempts to look for solutions of GR that incorporate CTCs that a human being would actually be able to travel along without getting torn to pieces. This is a bit beyond my current knowledge, but, as far as I am aware, all such solutions involve large quantities of negative energy, so they are unlikely to exist in nature and it is unlikely that we can construct them artificially. For this reason, CTCs are currently more of a curiosity for foundationally inclined physicists like myself than they are a practical method of time-travel.

Other Retrocausal Effects in Physics

Apart from GR, other forms of backwards-in-time, or retrocausal, effect have been proposed in physics from time to time. For example, there is the Wheeler-Feynman absorber theory of electrodynamics, which postulates a backwards-in-time propagating field in addition to the usual forwards-in-time propagating field, and Feynman also postulated that positrons might be electrons travelling backwards in time. There is also Cramer’s transactional interpretation of quantum theory[12], which does a similar thing with quantum wavefunctions, and the distinct, but conceptually similar, two-state vector formalism of Aharonov and collaborators[13]. Finally, retrocausal influences have been suggested as a mechanism to reproduce the violations of Bell-inequalities in quantum theory without the need for Lorentz-invariance violating nonlocal influences[14].

However, none of these proposals is as compelling an argument for taking the physics of time-travel into the past seriously as the existence of CTCs in General Relativity. This is because none of these theories provides a method for exploiting the retrocausal effect to actually travel back in time. Also, in each case, there is an alternative approach to the same phenomena that does not involve retrocausal influences. Nevertheless, it is possible that the models to be discussed have applications to these alternative approaches to physics.

Consistency Constraints and the Interpretation of Quantum Theory

Any viable theory of time travel into the past has to rule out things like the grandfather paradox. Consistency conditions have to be imposed on any physical model so that time-travel cannot be used to alter the past. This raises interesting questions about free will, e.g. what exactly stops someone from freely deciding to pull the trigger on their grandfather? Whilst these questions are philosophically interesting, physicists are more inclined to just lay out the mathematics of consistency and see where it leads. The different models of quantum time travel are essentially just different methods of imposing this sort of consistency constraint on quantum systems.
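To give a flavour of what such a constraint looks like, here is a minimal numerical sketch of the consistency condition from Deutsch's model[7]: the state \(\rho_{CTC}\) of the time-travelling system must be a fixed point of the map \(\rho \mapsto \mathrm{Tr}_{sys}[U(\rho_{sys} \otimes \rho)U^{\dagger}]\). The particular circuit below (a chronology-respecting qubit prepared in \(\Ket{1}\) controlling a NOT on the CTC qubit, i.e. a grandfather-paradox-style interaction) is my own illustrative choice, and numpy is assumed.

```python
import numpy as np

# CNOT with the chronology-respecting (system) qubit as control and the
# CTC qubit as target; the system is prepared in |1>, so the net effect
# on the CTC qubit is a NOT -- a grandfather-paradox-style interaction.
U = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=complex)
rho_sys = np.diag([0.0, 1.0]).astype(complex)  # |1><1|

def deutsch_map(rho_ctc):
    """One application of the consistency map:
    rho_ctc -> Tr_sys[ U (rho_sys (x) rho_ctc) U^dag ]."""
    rho_full = U @ np.kron(rho_sys, rho_ctc) @ U.conj().T
    # Partial trace over the system (first) qubit.
    return np.trace(rho_full.reshape(2, 2, 2, 2), axis1=0, axis2=2)

# Damped iteration to find a fixed point (Deutsch proved one always exists).
rho = np.diag([1.0, 0.0]).astype(complex)  # arbitrary starting guess |0><0|
for _ in range(100):
    rho = 0.5 * (rho + deutsch_map(rho))

print(np.round(rho.real, 3))  # the maximally mixed state I/2
```

For this circuit the unique consistent solution is the maximally mixed state, which is Deutsch's resolution of the grandfather paradox: the qubit that goes back in time is in an even mixture of "trigger pulled" and "trigger not pulled".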

That is pretty much it for the basic introduction, but I want to leave you with a quick thought experiment to illustrate the sort of quantum foundational issues that come up when considering time-travel into the past. Suppose you prepare a spin-\(\frac{1}{2}\) particle in a spin up state in the z direction and then measure it in the x direction, so that it has a 50-50 chance of giving the spin up or spin down outcome. After observing the outcome you jump onto a CTC, travel back into the past and watch yourself perform the experiment again. The question is, would you see the experiment have the same outcome the second time around?
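The 50-50 chance quoted above is just the Born rule applied to a z-spin eigenstate measured in the x basis. A minimal sketch of that arithmetic, assuming numpy:

```python
import numpy as np

# Spin-up along z, written in the z basis.
up_z = np.array([1.0, 0.0])

# Eigenstates of spin along x.
up_x = np.array([1.0, 1.0]) / np.sqrt(2)
down_x = np.array([1.0, -1.0]) / np.sqrt(2)

# Born rule: probability of an outcome = |<outcome|state>|^2.
p_up = abs(np.dot(up_x, up_z)) ** 2
p_down = abs(np.dot(down_x, up_z)) ** 2

print(p_up, p_down)  # 0.5 0.5 (up to floating point)
```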

A consistency condition for time travel has to say something like “the full ontic state (the state of the things that exist in reality) of the universe must be the same the second time round as it was the first time round”, albeit with your subjective position within it changed. If you believe, as many-worlds supporters do, that the quantum wavefunction is the complete description of reality, then it, and only it, must be the same the second time around. Therefore, it must be the case that the probabilities are still 50-50 and you could see either outcome. This is not inconsistent because many-worlds supporters believe that both outcomes happened the first time round in any case. If you are a Bohmian then the ontic state includes the positions of all particles in addition to the wavefunction, and these, taken together, determine the outcome of the experiment uniquely. Therefore, a Bohmian must believe that the measurement outcome has to be the same the second time around. Finally, if you are some sort of anti-realist neo-Copenhagen type then it is not clear exactly what you believe, but, then again, it is not clear exactly what you believe even when there is no time-travel.

There are some subtleties in these arguments. For example, it is not clear what happens to the correlations between you and the observed system when you go around the causal loop. If they still exist then this may restrict the ability of the earlier version of you to prepare a pure state. On the other hand, perhaps they get wiped out or perhaps your memory of the outcome gets wiped. The different models for the quantum physics of CTCs differ on how they handle this sort of issue, and this is what I will be looking at in future posts. If you have travelled along a CTC and happen to have brought a copy of these future posts with you then I would be very grateful if you could email them to me because that would be much easier for me than actually writing them.

‘Till next time!


  1. Bennett, C. H. et al. (2009). “Can closed timelike curves or nonlinear quantum mechanics improve quantum state discrimination or help solve hard problems”. Phys. Rev. Lett. 103:170502. eprint arXiv:0908.3023.
  2. Brun, T. A. and Wilde, M. M. (2010). “Perfect state distinguishability and computational speedups with postselected closed timelike curves”. eprint arXiv:1008.0433.
  3. Aaronson, S. and Watrous, J. (2009). “Closed timelike curves make quantum and classical computing equivalent”. Proc. R. Soc. A 465:631-647. eprint arXiv:0808.2669.
  4. Bacon, D. (2004). “Quantum Computational Complexity in the Presence of Closed Timelike Curves”. Phys. Rev. A 70:032309. eprint arXiv:quant-ph/0309189.
  5. Brun, T. A. (2003). “Computers with closed timelike curves can solve hard problems”. Found. Phys. Lett. 16:245-253. eprint arXiv:gr-qc/0209061.
  6. Brun, T. A., Harrington, J. and Wilde, M. M. (2009). “Localized closed timelike curves can perfectly distinguish quantum states”. Phys. Rev. Lett. 102:210402. eprint arXiv:0811.1209.
  7. Deutsch, D. (1991). “Quantum mechanics near closed timelike lines”. Phys. Rev. D 44:3197-3217.
  8. Lloyd, S. et al. (2010). “The quantum mechanics of time travel through post-selected teleportation”. eprint arXiv:1007.2615.
  9. I should mention that Joseph Fitzsimons (@jfitzsimons) disagreed with this statement in our Twitter conversations on this subject, and no doubt many physicists would too, but I hope to convince you that it is correct by the end of this series of posts.
  10. Stockum, W. J. van (1937). “The gravitational field of a distribution of particles rotating around an axis of symmetry”. Proc. Roy. Soc. Edinburgh A 57:135.
  11. Gödel, K. (1949). “An Example of a New Type of Cosmological Solution of Einstein’s Field Equations of Gravitation”. Rev. Mod. Phys. 21:447.
  12. Cramer, J. G. (1986). “The transactional interpretation of quantum mechanics”. Rev. Mod. Phys. 58:647-687.
  13. Aharonov, Y. and Vaidman, L. (2001). “The Two-State Vector Formalism of Quantum Mechanics: An Updated Review”. In Muga, J. G., Sala Mayato, R. and Egusquiza, I. L. (eds.), “Time in Quantum Mechanics”. eprint arXiv:quant-ph/0105101.
  14. For example, see Price, H. (1997). “Time’s Arrow and Archimedes’ Point”. OUP.